TNRC18 engages H3K9me3 to mediate silencing of endogenous retrotransposons

Trimethylation of histone H3 lysine 9 (H3K9me3) is crucial for the regulation of gene repression and heterochromatin formation, cell-fate determination and organismal development1. H3K9me3 also provides an essential mechanism for silencing transposable elements1–4. However, previous studies have shown that canonical H3K9me3 readers (for example, HP1 (refs. 5–9) and MPP8 (refs. 10–12)) have limited roles in silencing endogenous retroviruses (ERVs), one of the main transposable element classes in the mammalian genome13. Here we report that trinucleotide-repeat-containing 18 (TNRC18), a poorly understood chromatin regulator, recognizes H3K9me3 to mediate the silencing of ERV class I (ERV1) elements such as LTR12 (ref. 14). Biochemical, biophysical and structural studies identified the carboxy-terminal bromo-adjacent homology (BAH) domain of TNRC18 (TNRC18(BAH)) as an H3K9me3-specific reader. Moreover, the amino-terminal segment of TNRC18 is a platform for the direct recruitment of co-repressors such as HDAC–Sin3–NCoR complexes, thus enforcing optimal repression of the H3K9me3-demarcated ERVs. Point mutagenesis that disrupts the TNRC18(BAH)-mediated H3K9me3 engagement caused neonatal death in mice and, in multiple mammalian cell models, led to derepressed expression of ERVs, which affected the landscape of cis-regulatory elements and, therefore, gene-expression programmes. Collectively, we describe a new H3K9me3-sensing and regulatory pathway that operates to epigenetically silence evolutionarily young ERVs and exert substantial effects on host genome integrity, transcriptomic regulation, immunity and development. Trinucleotide-repeat-containing 18 (TNRC18), which has poorly understood functions, is now identified as an H3K9me3-specific reader that silences endogenous retroviruses.


TNRC18 mediates the silencing of ERV1
Fig. 1: […] TNRC18 proteins in HEK293 cells, with hits ranked by the fold change of normalized spectral abundance factor (NSAF) relative to control. Label font colors: green, zinc finger proteins; blue, histone and DNA-methylation-related factors; yellow, histone chaperones; purple, N6-methyladenosine-related factors; red, TNRC18. b,c, Scatter plots showing TE families exhibiting expression changes based on RNA-seq of HEK293 cells with endogenous TNRC18 KD (using TNRC18 shRNA (shTNRC18); b) relative to TNRC18 KD followed by TNRC18 re-expression (shTNRC18-rescue; n = 2 independent experiments), or cells with endogenous TNRC18 KO (TNRC18-KO; c) relative to WT (n = 3 independent experiments). The significance cut-off is a fold change in expression over 1.50 and an adjusted P value less than 0.01 for transcripts with base mean read counts over 10. Adjusted P values were calculated using negative binomial model-based methods (DESeq2). d, RT-qPCR for TEs (top) and immunity-related genes (bottom) in HEK293 cells with shTNRC18 versus control shRNA (shControl) or shTNRC18-rescue, or TNRC18 KO versus WT (n = 3 independent experiments; plotted as mean ± s.d. after normalization to GAPDH and control samples). *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001, two-sided t-test. Exact P values are shown in Supplementary Table 8. e,f, GSEA revealed pathway enrichment in cells with shTNRC18 versus shControl (e), or TNRC18 KO versus WT (f). Immunity-related gene sets are labelled in red. The y axis and x axis show normalized enrichment score (NES) and false discovery rate (FDR) q values, respectively. g, RT-qPCR for TEs in the indicated TNRC18 KO cells versus WT cells (n = 3 independent experiments; plotted as the mean ± s.d. after normalization to GAPDH and to WT). *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001, two-sided t-test. Exact P values are shown in Supplementary Table 8. h,i, RNA-seq revealed TEs exhibiting expression changes in SNU-1 cells (h) and HT-29 cells (i) with TNRC18 KO versus WT (n = 3 independent experiments). Significance cut-off is the same as in b,c. Adjusted P values were calculated using negative binomial model-based methods (DESeq2).

To determine putative transposable element (TE) regulatory roles of TNRC18, we performed TNRC18 knockdown (KD) in HEK293 cells and subsequent rescue by re-expressing exogenous TNRC18, which carries synonymous mutations designed to confer resistance to TNRC18-targeting short hairpin RNA (shRNA) (Extended Data Fig. 1f). Because traditional shRNA-based strategies only produced partial TNRC18 KD (Extended Data Fig. 1f), we also used a CRISPR-Cas9-based approach to achieve TNRC18 knockout (KO) in the same cells (Extended Data Fig. 2a,b). Subsequent RNA sequencing (RNA-seq)-based analyses of TEs revealed that, among the various TE families, those most reactivated after TNRC18 loss (either KD or KO) compared with mock included LTR12 and LTR7, and that such reactivated ERVs became re-silenced following TNRC18 re-expression (Fig. 1b,c, Extended Data Fig. 1g and Supplementary Table 2). LTR12 is abundant, with more than 6,000 copies in the human genome, and evolutionarily young (ref. 14; Extended Data Fig. 1h). Also, both full-length proviruses (HERV9NC-int) and solitary LTRs (for example, LTR12B, LTR12C and LTR12F, with their internal viral sequences deleted owing to homologous recombination events) of the LTR12 family were repressed by TNRC18. Notably, expression of the exogenously introduced TNRC18 used for rescue was significantly higher than that of endogenous TNRC18 (Extended Data Fig. 1f), which led to more substantial silencing of ERV1 subfamilies and other TE families such as ERV class II (Fig. 1b). RNA-seq analysis using both unique mapping reads and multi-mapping reads showed consistent results (Extended Data Fig. 2c). Reverse transcription followed by quantitative PCR (RT-qPCR) confirmed derepression of the above ERVs following TNRC18 loss (KD or KO) and re-silencing of ERVs following TNRC18 re-expression (Fig. 1d,g). Thus, TNRC18 dose-dependently silences ERV1.
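The significance cut-off stated in the Fig. 1 legend (fold change over 1.50, adjusted P value below 0.01, base mean read count over 10) can be written as a simple filter. The sketch below is illustrative only: the record layout, the field names, the example values and the symmetric treatment of downregulation (fold change below 1/1.5) are assumptions, not the authors' pipeline.

```python
import math
from dataclasses import dataclass


@dataclass
class TEResult:
    """One row of a DESeq2-style results table (field names are hypothetical)."""
    family: str
    base_mean: float
    log2_fold_change: float
    padj: float


def classify(row, fc=1.5, alpha=0.01, min_base_mean=10.0):
    """Return 'up', 'down' or 'ns' (not significant) under the stated cut-offs."""
    # Rows with missing adjusted P values, low counts or weak evidence are 'ns'.
    if math.isnan(row.padj) or row.base_mean <= min_base_mean or row.padj >= alpha:
        return "ns"
    threshold = math.log2(fc)  # a 1.5-fold change is about 0.585 on the log2 scale
    if row.log2_fold_change > threshold:
        return "up"
    if row.log2_fold_change < -threshold:  # assumption: symmetric cut-off
        return "down"
    return "ns"


# Made-up values for illustration only; these are not data from the study.
rows = [
    TEResult("LTR12C", 850.0, 1.9, 1e-6),
    TEResult("L1HS", 1200.0, 0.2, 0.4),
    TEResult("MER52C", 6.0, 2.5, 1e-4),
]
calls = {r.family: classify(r) for r in rows}
```

Working on the log2 scale keeps the up and down thresholds symmetric, and the base-mean filter is applied before the fold-change test so that low-count families are never called.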
We next surveyed publicly available datasets and found that TNRC18 is widely expressed among various tissues (Extended Data Fig. 3a). Such ubiquitous TNRC18 expression suggests that this protein has a more general role that goes beyond HEK293 cells. To test this notion, we used independent designs of TNRC18-targeting sgRNAs and generated TNRC18 KO in four different human cancer cell lines of epithelial origin: 22Rv1, a human prostate carcinoma epithelial cell line; HT-29, a human colorectal adenocarcinoma line; SNU-1, a human gastric carcinoma line; and NCI-H23, a human lung adenocarcinoma line (Extended Data Fig. 3b,c). RT-qPCR showed that TNRC18 KO caused consistent and strong reactivation of LTR12 and LTR7 family TEs in all tested cell lines (Fig. 1g). The degree of TNRC18 KO-induced TE reactivation in these cells was either comparable with what was seen in HEK293 cells or more substantial, such as activation of LTR12, HERV9NC-int and/or LTR7 by approximately 3–6-fold (Fig. 1d,g). RNA-seq of SNU-1 and HT-29 cells before and after TNRC18 KO also showed global activation of ERVs (Fig. 1h,i and Supplementary Table 2), as seen with TNRC18 KO in HEK293 cells.

TNRC18 loss leads to immune activation
Retrotransposon activation generates double-stranded RNAs that stimulate interferon-responsive genes, a process known as viral mimicry29,30. Gene set enrichment analysis (GSEA) revealed that immunity-related pathways were upregulated following TNRC18 KD, and such activation of immunity-related genes was reversed after restoration of TNRC18 expression (Fig. 1e, Extended Data Fig. 2d,e and Supplementary Table 3). RT-qPCR verified that TNRC18 loss substantially activated the transcription of immunity-related genes such as IL11, IL18 and TLR4 (Fig. 1d, bottom). Analyses of RNA-seq profiles of three different human cell lines (HEK293, SNU-1 and HT-29) also showed that there was significant, consistent activation of immunity-related genes (Fig. 1d–f, Extended Data Figs. 2e,f and 3d and Supplementary Table 3). Depletion of the HUSH complex can cause activation of interferon-stimulated genes owing to derepressed LINE1s (ref. 4). Here TNRC18 loss leads to preferential derepression of ERVs, which similarly induces viral mimicry.

TNRC18 directly associates with ERVs
To determine genomic binding of TNRC18 in cells, we performed cleavage under targets and release using nuclease (CUT&RUN) for TNRC18 using independent approaches. We used either an antibody against endogenous TNRC18 in parental HEK293 cells or a GFP antibody in cells stably expressing GFP-tagged TNRC18. CUT&RUN for endogenous TNRC18 and GFP-TNRC18 produced strong and highly correlated profiles, thereby confirming the validity of the mapped binding sites (Pearson r² = 0.82; Extended Data Fig. 4a). A significant proportion of TNRC18 peaks were called as TE-bound (for details, see Methods). Approximately 50% and 31% of these most confident TNRC18-bound TEs were annotated as SINEs and ERVs, respectively (Extended Data Fig. 4b). Consistently, subsequent enrichment analyses of TNRC18 binding also revealed that it was most enriched at ERVs, notably ERV1 (Fig. 2a and Extended Data Fig. 4c; columns in red and green), and at SINEs but not other TEs, including LINEs, satellite DNA and simple repeats (Fig. 2a and Extended Data Fig. 4c). CUT&RUN for TNRC18 in HeLa and K562 cells produced overall similar binding patterns (Extended Data Fig. 4d), as exemplified by what was detected at ERVs (LTR12C and MER52C) and promoter or intergenic target sites (Extended Data Fig. 4e). This result indicates that genomic targeting of TNRC18 is conserved across different cell types. CUT&RUN for H3K9me3 with independent antibodies showed that H3K9me3 overlapped with TE-associated TNRC18 peaks (Extended Data Fig. 4f), particularly those located at ERV regions (Extended Data Fig. 4g).
Next, we examined TNRC18 and H3K9me3 binding across all ERV subfamilies of TEs. Most sites that showed dual binding of TNRC18 and H3K9me3 belong to ERV1, such as LTR12C, LTR12D and LTR12E (Fig. 2b, blue). By contrast, certain subfamilies of ERV class II and III TEs exhibited high H3K9me3 positivity but had low enrichment for TNRC18 binding (Fig. 2b, orange and yellow). Note that our RNA-seq-based profiling identified that LTR12 family TEs were the most silenced by TNRC18, which is in line with the observation that they were the most strongly bound by TNRC18 and H3K9me3 based on CUT&RUN (Fig. 2b,c). Unbiased motif analysis of TNRC18 peaks revealed the strongest enrichment for NFY (CCAAT) and Sp (CCCCACC-like) motifs (Extended Data Fig. 4h and Fig. 2d), which are typical motifs in the U3 region of LTR12 (ref. 31). This result raises the possibility that certain DNA-sequence-specific binding factors are involved in directing preferential targeting of TNRC18 to ERVs. Pearson correlation revealed that TNRC18 binding was positively correlated with H3K9me3 at LTR12 regions (Fig. 2e), as exemplified by what was seen at an LTR12D copy in chromosome 13 (Fig. 2f). Together, these data demonstrate that TNRC18 binds H3K9me3-demarcated ERVs, particularly LTR12.
Structural comparisons of the TNRC18(BAH)-H3K9me3 complex with previously reported BAH-histone complexes32,34,36 revealed that BAH domains present a similar surface region for histone recognition, but with highly diverse interaction mechanisms (Extended Data Fig. 6e). Of note, BAHCC1(BAH) forms sequence-specific interactions with residues H3S28 and H3P30 of the H3K27me3-containing peptide (Extended Data Fig. 6e), in contrast to the TNRC18(BAH)-H3T6 interaction (Fig. 3g). Thus, BAH domains have evolved into a versatile family of histone readers.
It is worth noting that structural superposition of TNRC18(BAH) with the nucleosome core particle (NCP)-bound yeast Sir3 BAH domain (Sir3(BAH); Protein Data Bank (PDB) identifier 4KUD)38 revealed that a basic patch on the surface of TNRC18(BAH) (Extended Data Fig. 6b, right) corresponds to a region in Sir3(BAH) involved in binding to the acidic patch of NCP (Extended Data Fig. 6f), which indicated the presence of additional NCP-binding sites in TNRC18(BAH). Indeed, biolayer interferometry assays showed that TNRC18(BAH) binds to NCP containing a methyllysine analogue of histone H3K9 (H3Kc9me3) with a Kd of 18.8 nM, much tighter than its binding to the H3K9me3 peptide alone in vitro (Extended Data Fig. 6g,h). Together, these data point to a strong binding affinity between TNRC18 and NCP.
TEs can serve as gene regulatory elements following activation17,18. We examined expression of the genes in proximity to TNRC18-bound sites. The overall expression of genes associated with all TNRC18 peaks or those with TNRC18-bound transcription start site (TSS) regions did not show significant changes in TNRC18(BAH)-mutant cells compared with WT cells (Extended Data Fig. 7d). By contrast, those associated with the TNRC18-bound LTRs exhibited significant overall activation (Extended Data Fig. 7e). At genomic regions harbouring LTRs bound by both H3K9me3 and TNRC18, the genes upregulated in TNRC18(BAH)-mutant cells relative to WT cells showed closer proximity to the nearby TNRC18 peaks than those downregulated or without expression change (with an average of 4.81 kb, 10.26 kb and 26.32 kb for the upregulated, downregulated or stably expressed genes, respectively, from a nearby TNRC18 peak; Extended Data Fig. 7f). A similar gene regulatory mode was reported for ERVs activated after TRIM28 loss39. Collectively, these results show that TNRC18 silences ERV1, which influences the expression of nearby genes in a BAH-dependent and H3K9me3-dependent manner.

TNRC18 silences transcription from TEs
Treatment with DNA methyltransferase and histone deacetylase (HDAC) inhibitors can induce genome-wide activation of treatment-induced non-annotated TSSs (TINATs), over 80% of which overlapped TEs, with the LTR12 family most enriched40. Our RT-qPCR-based results showed that representative TINATs, including those in CRADD, DNAH12 and FMN1, were all activated in TNRC18(W2858A)-mutant cells compared with their expression in WT cells (Extended Data Fig. 7g). To confirm that TINAT activation occurs at a genome-wide level, we conducted cap analysis of gene expression followed by sequencing (CAGE-seq)41 (Extended Data Fig. 7h). CAGE-seq accurately identifies TSSs and distinguishes multiple transcription isoforms41. Consistent with the RNA-seq-based analyses, CAGE-seq demonstrated that transcripts from the LTR12 family TEs were globally activated in TNRC18(W2858A)-mutant cells compared with WT cells (Fig. 4d). For example, the DHRS2, ACSBG1, ETV7 and SYNGR1 transcripts that have LTR12 as TSSs were all upregulated in the TNRC18(W2858A)-mutant cells compared with control cells (Fig. 4e and Extended Data Fig. 7i). Another example is the TSSs of TNFRSF10B, which comprise a canonical TSS and a cryptic one residing in a nearby LTR12C locus (ref. 42), and HDAC inhibitor treatment activated the latter TSS (ref. 42). In TNRC18(W2858A)-mutant cells, TNFRSF10B transcription was activated from the LTR12C-associated cryptic TSS, without having an obvious effect on that from the canonical TSS (Fig. 4f). Additionally, the TNRC18 W2858A mutation activated the transcription of LTR12C located at noncoding RNA TSSs such as lncRNA11-11N5.1 and lncRNA11-299H22.1 (Extended Data Fig. 7j). Collectively, we revealed a BAH-dependent repressive role of TNRC18 in the transcription of LTRs and LTR-regulated nearby genes.

Fig. 4: […] Tnrc18 W2745A/Y2747A mutation, among newly born pups (P0) from breeding of Tnrc18 W2745A/Y2747A heterozygous mice. n, cohort size. h, RNA-seq revealed the indicated TEs that exhibited expression changes in primary lung tissues from mice with a Tnrc18 W2745A/Y2747A mutation (n = 3 mice with two technical replicates each) compared with those of WT littermates (n = 2 mice with two technical replicates each). Significance cut-off is the same as in a, except that the expression fold change is set to 2. Adjusted P values were calculated using negative binomial model-based methods (DESeq2). i, GSEA revealed pathway enrichment in lungs from mice with a Tnrc18 W2745A/Y2747A mutation versus WT. For b and i, immunity-related gene sets are labelled in red.

Tnrc18 mutation causes death in mice
Activation of ERVs, interferon-stimulated genes and cryptic transcripts in TNRC18(BAH)-mutant cells implied a wider functionality for TNRC18. To further evaluate the functional relevance of TNRC18(BAH) in development, we used a CRISPR-Cas9-based approach to produce heterozygous mice harbouring an H3K9me3-binding-defective mutation of Tnrc18 BAH (mouse Tnrc18 W2745A/Y2747A, equivalent to human TNRC18 W2858A/Y2860A; Extended Data Fig. 8a). Breeding of Tnrc18 W2745A/Y2747A heterozygous mice produced homozygous mutant pups at birth at a rate of 8% (13 out of a total of 161), which was significantly lower than the expected frequency (Fig. 4g). Additionally, the surviving homozygous mutant mice, both males and females, exhibited a dwarfism phenotype compared with their WT or heterozygous littermates (Extended Data Fig. 8b).
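The shortfall of homozygous pups can be checked with simple arithmetic. The sketch below assumes that the expected frequency is the Mendelian 25% for a heterozygous intercross and uses a one-sided exact binomial test as an illustrative choice; the text does not state which expected frequency or statistical test was used.

```python
from math import comb


def binom_lower_tail(k, n, p):
    """Exact probability of observing k or fewer successes in n Bernoulli(p) trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_pups = 161              # total pups genotyped at P0 (from the text)
n_homozygous = 13         # homozygous mutant pups observed (from the text)
expected_fraction = 0.25  # assumption: Mendelian ratio for a het x het cross

observed_fraction = n_homozygous / n_pups       # about 0.081, i.e. roughly 8%
expected_count = n_pups * expected_fraction     # 40.25 pups expected
p_value = binom_lower_tail(n_homozygous, n_pups, expected_fraction)

print(f"observed {observed_fraction:.1%}, expected {expected_count:.2f} pups, "
      f"one-sided P = {p_value:.2e}")
```

Under these assumptions, about 40 homozygous pups would be expected, so observing 13 corresponds to roughly one-third of the Mendelian expectation.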
To further assess the function of TNRC18 in mouse tissues, we isolated five different primary tissues (lung, liver, heart, brain and stomach) from WT mice and paired littermates carrying the Tnrc18 BAH homozygous mutation. RNA-seq demonstrated strong ERV activation in the lung and liver, less so in the heart and stomach, and no significant change in the brain of Tnrc18 BAH mutant mice compared with that of WT mice (Fig. 4h, Extended Data Fig. 8c and Supplementary Table 4). Most reactivated TEs were known targets of SETDB1. RNA-seq also confirmed that the same immunity-related pathways were substantially activated in the lung and liver from Tnrc18-mutated mice compared with WT mice (Fig. 4i, Extended Data Fig. 8d,e and Supplementary Table 4). We next established primary mouse embryonic fibroblasts (MEFs) from WT and Tnrc18 BAH homozygous mutant embryos (Extended Data Fig. 8f). RNA-seq analyses showed that ERVs were significantly activated in Tnrc18 BAH mutant MEFs compared with WT MEFs (Extended Data Fig. 8g and Supplementary Table 4). Concurrent with ERV derepression was activation of interferon-stimulated and immunity-related genes (Extended Data Fig. 8h and Supplementary Table 4). These observations suggest that the loss-of-function mutation of Tnrc18 BAH has widespread effects on deregulating TEs and genes in multiple tissues and organs during development, which may explain the embryonic death and dwarfism seen with Tnrc18-mutated mice.
The Sin3-HDAC complex binds partner proteins through interactions between the paired amphipathic helix (PAH) domain of Sin3 and the Sin3-interacting motif of the partner (enriched for a consensus LXXLL-like sequence, with X indicating any amino acid)46. On the basis of sequence prediction, we synthesized peptides for each of the potential Sin3-interacting motifs within TNRC18 (Supplementary Table 7) and used ITC to assess their potential binding to the PAH domain of Sin3. Among these, amino acids 718–792 of TNRC18 (Fig. 5c, left, and Extended Data Fig. 9c; named as a Sin3-interaction motif of TNRC18) bound to the PAH domain of Sin3 with a Kd of 104.5 μM. By contrast, two independent mutations of this fragment largely abolished its binding to the PAH domain of Sin3 (Fig. 5c, right). Next, we focused on TNRC18(L760), a residue located at the centre of the Sin3-interaction motif (Extended Data Fig. 9c) and crucial for efficient binding to the PAH domain of Sin3 (Fig. 5c, right). Here we used the CRISPR-Cas9 technique to introduce a TNRC18 L760A mutation into HEK293 cells (Extended Data Fig. 9d). RNA-seq profiling of TNRC18(L760A)-mutant cells and TNRC18 WT cells showed reactivation of ERV family TEs in the mutant cells (Extended Data Fig. 9e and Supplementary Table 6). RT-qPCR analyses showed that there was approximately 1.5–2-fold activation of LTR12 and LTR7 family TEs (Extended Data Fig. 9f). Such TE reactivation resembles what was seen with TNRC18 KO or its BAH domain mutation, although the degree of TE reactivation in the Sin3-binding mutant (L760A) seemed milder. Nevertheless, the same immunity-related gene sets were activated in TNRC18(L760A)-mutant cells compared with TNRC18 WT cells, as seen with TNRC18 KO (Extended Data Fig. 9g,h and Supplementary Table 6). Collectively, these results establish a crucial requirement for TNRC18 and the associated Sin3-HDAC complex for mediating ERV repression.
Consistent with the co-immunofluorescence results (Fig. 5b and Extended Data Fig. 9b), CUT&RUN for HDAC, TRIM28 and SETDB1 showed that all of them colocalized well with TNRC18-bound genomic regions (Fig. 5d and Extended Data Fig. 9i), as exemplified by their co-binding at ITGB3BP, ZNF180 and ZNF260 (Extended Data Fig. 9j). Of note, motif analysis of TNRC18 CUT&RUN peaks also revealed the DNA-binding motifs of the TRIM28-associated KRAB-ZnF proteins such as ZFP809 (ref. 47) (Extended Data Fig. 9k). These observations not only point to functional cooperation between TRIM28 and TNRC18 but also indicate the potential involvement of DNA-binding ZNFs in mediating and/or stabilizing binding of TNRC18 to genomic targets. SETDB1 KO in HEK293 cells (Extended Data Fig. 10a) led to a partial but significant decrease in TNRC18 levels in these cells compared with WT cells (Extended Data Fig. 10b, left; see anti-TNRC18). This effect is reminiscent of the observed decreased H3K9me3 and H3K9me2 regulator levels following depletion of the associated partner Rif1 in stem cells48. As expected, CUT&RUN showed a global decrease in the genomic binding of TNRC18 in SETDB1-depleted cells compared with WT cells (Extended Data Fig. 10c). Although the underlying mechanism of this effect remains unclear, the observed partial loss-of-function of TNRC18 after SETDB1 loss further supports a functional link between the two.
Next, we sought to determine whether TNRC18 functions to recruit co-repressors to genomic sites such as ERVs for silencing. Towards this end, we conducted CUT&RUN for TNRC18, ERV co-repressors (HDAC2, TRIM28 and SETDB1) and histone modifications (H3K9me3 and H3 acetylation (H3ac)) in WT and TNRC18(W2858A)-mutated HEK293 cells. Genome-wide binding of TNRC18 was significantly decreased in the TNRC18(W2858A)-mutated cells relative to WT cells (Fig. 5e, red). There were concurrent decreases in binding by the three examined co-repressors (HDAC2, TRIM28 and SETDB1) at TNRC18-bound genomic sites in the TNRC18-mutant cells compared with WT cells (Fig. 5e). The total SETDB1 protein level was unaffected following TNRC18 KO or mutation of its BAH domain (Extended Data Fig. 10b, right). Similarly, global H3K9me3 and H3ac showed no changes, as assessed by immunoblotting and CUT&RUN (Extended Data Fig. 10d–f). However, H3ac levels at LTR12 subclasses (LTR12C, LTR12D and LTR12E), but not other TEs (such as LINEs and SINEs), were increased in TNRC18(W2858A)-mutated cells compared with WT cells. By contrast, H3K9me3 levels at these LTR12s, and not at LINEs or SINEs, were decreased (Fig. 5f), as exemplified by what was seen at the LTR12C locus on chromosome 14 (Fig. 5g). Whole-genome bisulfite sequencing showed that the TNRC18 BAH mutation did not change DNA methylation levels, either globally (Extended Data Fig. 10g) or at LTR12 and LTR7 families (Extended Data Fig. 10h). Thus, the TNRC18 BAH mutation seems to mainly affect the histone modification status (relative levels of H3K9me3 compared with H3ac), instead of DNA methylation, at ERVs. To further validate the observed repressive function of TNRC18, we used the GAL4 DNA-binding domain (DBD)-fusion-based reporter assay (Extended Data Fig. 10i). Relative to mock, TNRC18 induced a significant transcriptional repression effect (Extended Data Fig. 10i). In agreement with the CUT&RUN results, chromatin immunoprecipitation with qPCR (ChIP-qPCR) revealed that, concurrent with the expected increase in TNRC18 binding to the GAL4 target sites, the binding levels of HDAC2, TRIM28 and SETDB1 were all significantly increased, whereas those of H3ac were decreased (Fig. 5h).
Together, these results support a model in which H3K9me3 recognition by TNRC18, potentially synergistic with the engagement of DNA-binding factors (for example, ZNFs), helps recruit and/or tether TNRC18 and its associated complex onto target sites (such as ERVs) to maintain a repressed chromatin state.

Discussion
This study reports that TNRC18(BAH) is a mammalian H3K9me3 reader, and our biochemical and structural characterization demonstrated that such specificity for H3K9me3 engagement partially stems from various interactions of TNRC18(BAH) with histone backbone sequences adjacent to H3K9me3, such as H3T6 and H3S10. Genome-wide profiling of TNRC18 showed enriched binding to ERVs, notably the ERV1 family TEs such as LTR12 and LTR7. Both TNRC18 loss and an H3K9me3-binding-defective mutation of TNRC18(BAH) led to significantly activated expression of ERV1, inducing viral mimicry. Preferential silencing effects by TNRC18 on ERVs are in contrast to those induced by the HUSH complex on LINE1s (refs. 3,4). This result indicates that there is a division of labour among various H3K9me3 reader complexes in ensuring repression of various TE families. Significant enrichment in CCAAT motifs at TNRC18-bound sites, as well as ZNFs identified through TNRC18 BioID, suggest that additional DNA sequence-specific factors (in addition to chromatin-associating factors) are likely to mediate genomic recruitment of TNRC18 (for details, see Supplementary Note 1). These observations are consistent with the notion of an 'arms race' in which various epigenetic machineries and associated DNA-binding ZNFs are deployed to silence and 'tame' viral insertions during evolution17,18,49,50. TNRC18 depletion, or mutagenesis of the H3K9me3-binding TNRC18(BAH) domain, in a range of different model systems resulted in activation of LTR12s, LTR12-associated cryptic TSSs and immune-related genes. Moreover, mice harbouring an H3K9me3-binding-defective mutation in TNRC18(BAH) exhibited severe phenotypes of neonatal death and, in surviving adult animals, dwarfism. Overall, this study demonstrates that a TNRC18(BAH)-based H3K9me3-sensing pathway operates to modulate the landscape of promoter-enhancer activities by controlling TEs and the embedded cis-elements, and that TNRC18 is a crucial regulator of immunity and development.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-023-06688-z.

Plasmids and antibodies
The sgRNAs targeting representative ERV loci were cloned into pLentiGuidePuro (Addgene, 52963). The pLKO.1-puro-based shRNA plasmids for KD of TNRC18 were purchased from Sigma. Detailed information on the sgRNAs and shRNAs is provided in Supplementary Table 7. Full-length TNRC18 cDNA was synthesized (Wuxi Qinglan Biotech), with synonymous mutations introduced to confer shRNA resistance or to create unique internal restriction enzyme sites for facilitating mutagenesis of TNRC18(BAH). The synthesized full-length TNRC18 cDNA was first ligated into a PiggyBac Transposase Expression vector, PB-EF1α-MCS-IRES-Neo (System Biosciences, PB533A-2). In-frame fusion of a GFP tag (which was used for immunofluorescence (IF), immunoblotting and CUT&RUN), BirA* (used for BioID) or the DBD of GAL4 (GAL4-DBD) with TNRC18 was conducted by ligating the corresponding PCR product into a unique restriction enzyme site flanking either the N or C terminus of TNRC18 in the above PiggyBac-EF1α expression plasmid. The design of the TNRC18 mammalian expression vector is illustrated in Extended Data Fig. 1c. The pcDNA3.1-based mammalian expression plasmid for TNRC18 was created by subcloning. All plasmids used were verified by Sanger sequencing. The antibodies used for immunoblotting, IF, co-IP and CUT&RUN are listed in Supplementary Table 7.

IF and colocalization analysis
Cells were grown on polylysine-coated coverslips (Corning, 354085) overnight in a 37 °C incubator. The coverslips were washed with PBS and then fixed in 4% formaldehyde for 10 min at room temperature. Fixed cell samples were washed with cold PBS three times and incubated in PBS plus 0.1% Triton X-100 for 10 min, followed by washing with PBS three times and incubation in blocking buffer (1% BSA in PBS plus 0.1% Tween-20) for 30 min. After removal of the blocking buffer, the fixed samples were then incubated with diluted primary antibody (anti-TNRC18 1:1,000, anti-Flag 1:1,000, anti-GFP 1:1,000; see the Nature Portfolio Reporting Summary for detailed information) in blocking buffer overnight at 4 °C in a humidified chamber, and washed with PBS plus 0.1% Tween-20 three times (3 min each time). Last, the samples were incubated with secondary antibody (diluted at 1:2,000) conjugated to the appropriate fluorophore for 1 h at room temperature and washed three times with PBS before adding the mounting medium (Thermo Scientific, P36935). The slides were then dried overnight in the dark before imaging on an Olympus FV1000 confocal microscope with a ×100/1.4 NA plan apochromat oil-immersion objective.
Colocalization analysis was performed using the EzColocalization plugin in Fiji (v.1.53)52. Colocalization was measured using the Pearson correlation coefficient as previously described53,54. For analysis, nuclei were manually segmented by tracing with the polygon selection tool, and then converted into binary masks used in the EzColocalization plugin to restrict colocalization analysis to the nuclei.
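The quantity measured here, a Pearson correlation coefficient restricted to pixels inside a nuclear mask, can be sketched in a few lines. The code below is a plain re-implementation for illustration and is not the EzColocalization plugin or its API; the function name, the nested-list image representation and the toy pixel values are all assumptions.

```python
from math import sqrt


def masked_pearson(channel_a, channel_b, mask):
    """Pearson correlation of two channels over pixels where mask is truthy.

    channel_a, channel_b and mask are equally shaped 2D lists (rows of pixels).
    """
    a = [v for row, mrow in zip(channel_a, mask) for v, m in zip(row, mrow) if m]
    b = [v for row, mrow in zip(channel_b, mask) for v, m in zip(row, mrow) if m]
    n = len(a)
    if n < 2:
        raise ValueError("mask must select at least two pixels")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel is constant inside the mask")
    return cov / sqrt(var_a * var_b)


# Toy 3x3 images (invented intensities); the mask marks a 2x2 'nucleus'.
green = [[10, 20, 0], [30, 40, 0], [0, 0, 0]]
red = [[12, 22, 99], [33, 41, 99], [99, 99, 99]]
nucleus = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
r = masked_pearson(green, red, nucleus)
```

Restricting the calculation to the mask matters: in the toy example the bright red background outside the nucleus would otherwise dominate the correlation.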

Immunoblotting
Samples of total protein were prepared using lysis buffer (50 mM Tris-HCl (pH 8.0), 150 mM NaCl and 1% IGEPAL CA-630 detergent), followed by brief sonication and centrifugation. The same amount of extracted sample was loaded onto an SDS-PAGE gel for immunoblot analysis. All antibodies for immunoblotting were diluted 1:1,000. Detailed information for antibodies is provided in the Nature Portfolio Reporting Summary. Uncropped scans of blots are shown in Supplementary Fig. 1.

IP
Total protein samples were prepared as described above, followed by brief sonication and high-speed centrifugation. Extracted samples were incubated with primary antibodies (2 μg) on a rotator overnight at 4 °C. Dynabeads (Invitrogen) were added and incubated for an additional 2 h. Beads were washed three times in lysis buffer, resuspended in 50 μl of 2× sample loading buffer and boiled at 90 °C for 5 min before loading onto an SDS-PAGE gel. Western blotting was performed using standard protocols with a PVDF membrane, and signals were visualized using an ImageQuant LAS 4000 Luminescent Image Analyzer (GE Healthcare). Repeated immunoblotting using either freshly prepared total protein samples or those from anti-TNRC18 IP detected full-length TNRC18 and the two TNRC18 protein cleavage products (with sizes of about 220 and 80 kDa) that interact with each other (data not shown), reminiscent of what was reported for MLL (also known as KMT2A)55.

RT-qPCR
Total RNA (1 μg) was extracted from cells or tissues using an RNeasy Mini kit (Qiagen, 74106; with on-column digestion for 30 min by DNase set, Qiagen, 79254), and RT was carried out using a High Capacity cDNA Reverse Transcription kit (Applied Biosystems, 4368814) according to the provider's protocols. qPCR was run in triplicate using SYBR green master mix reagent (Bio-Rad, 1725125) on an ABI 7900HT fast real-time PCR system (ABI). Data from at least three independent experiments are presented as the mean ± s.d. after normalization against internal control signals. Information on primers used for RT-qPCR is provided in Supplementary Table 7.
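One common way to normalize qPCR signals against an internal control and a reference sample is the 2^(−ΔΔCt) calculation. The sketch below assumes that approach and roughly 100% primer efficiency; the text does not name the calculation used, and the function name and all Ct values are invented for illustration.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target in a test sample versus a control sample.

    Each Ct is first normalized to the reference gene (for example, GAPDH) in
    the same sample, then the test sample is compared with the control sample.
    """
    delta_ct_test = ct_target - ct_reference
    delta_ct_ctrl = ct_target_ctrl - ct_reference_ctrl
    return 2.0 ** -(delta_ct_test - delta_ct_ctrl)


# Invented Ct values for three independent experiments:
# (target in KO, GAPDH in KO, target in WT, GAPDH in WT)
experiments = [
    (24.0, 18.0, 26.0, 18.0),
    (24.5, 18.2, 26.1, 18.1),
    (23.8, 17.9, 26.0, 18.0),
]
fold_changes = [relative_expression(*e) for e in experiments]
summary = (mean(fold_changes), stdev(fold_changes))  # reported as mean, s.d.
```

In the first invented experiment the target Ct is two cycles lower in the KO sample after GAPDH normalization, which corresponds to a fourfold increase.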

CasID
Cells with stable expression of the dCas9-BirA* fusion protein were transfected with pooled plasmids of pLentiGuidePuro that contain a set of sgRNAs targeting the representative ERV loci (for details, see Supplementary Table 7) or empty vector (which served as the control group). After treatment with 50 μM biotin for 24 h, cells were collected from two 15-cm plates and then washed twice with cold PBS. Cell pellets were resuspended in 1 ml RIPA lysis buffer (10% glycerol, 25 mM Tris-HCl pH 8.0, 150 mM NaCl, 2 mM EDTA, 0.1% SDS, 1% NP-40 and 0.2% sodium deoxycholate), and 1 μl benzonase (Sigma-Aldrich, E1014) was added to the lysate and incubated on ice for 1 h. After centrifugation at maximum speed for 30 min at 4 °C, the supernatant was collected and incubated with Neutravidin beads (Thermo Fisher, 29204) overnight at 4 °C. The Neutravidin beads were washed twice with RIPA buffer and TAP lysis buffer (10% glycerol, 350 mM NaCl, 2 mM EDTA, 0.1% NP-40 and 50 mM HEPES, pH 8.0) sequentially. Finally, the beads were washed three times with ABC buffer (50 mM ammonium bicarbonate, pH 8.0) and subjected to mass-spectrometry (MS)-based analysis. We focused on the top 300 hits showing the most significant enrichment in CasID versus control, which contained many of the TE regulators previously identified in reporter-based assays23,26–28, suggesting that our CasID approach works well. We scrutinized top hits by annotating their domain composition using the Simple Modular Architecture Research Tool (http://smart.embl-heidelberg.de/), which showed that TNRC18, an under-studied protein, contains a putative chromatin-binding domain, namely, the BAH domain studied in this work.

BioID
A BirA* cDNA fragment was inserted in-frame into the PB-EF1α-based TNRC18 expression plasmid, followed by the establishment of stable expression lines in HEK293 and HeLa cells. BioID was performed as previously described 56,57. In brief, cells cultured in two 15-cm plates were treated with 50 μM biotin for 24 h and then washed twice with cold PBS. Cell pellets were resuspended in 1 ml RIPA lysis buffer, and 1 μl benzonase (Sigma-Aldrich, E1014) was added to the lysate, followed by incubation on ice for 1 h. After centrifugation at top speed for 30 min at 4 °C, the supernatant was collected and incubated with Neutravidin beads (Thermo Fisher, 29204) overnight at 4 °C. The Neutravidin beads were washed twice with RIPA buffer and TAP lysis buffer sequentially. Finally, the beads were washed three times with ABC buffer and subjected to MS-based analysis. The identified BioID hits were further analysed using STRING 58.

MS-based protein identification
Proteins were eluted from beads by adding 50 μl 2× Laemmli buffer and heating at 95 °C for 5 min. A total of 50 μl of each sample was resolved by SDS-PAGE using a 4-20% Tris-glycine wedge well gel (Invitrogen) and visualized by Coomassie staining. Each SDS-PAGE gel lane was sectioned into 12 segments of equal volume. Each segment was subjected to in-gel trypsin digestion as follows. Gel slices were destained in 50% methanol, 50 mM ammonium bicarbonate (Sigma-Aldrich), followed by reduction in 10 mM tris(2-carboxyethyl)phosphine and alkylation in 50 mM iodoacetamide. Gel slices were then dehydrated in acetonitrile, followed by the addition of 100 ng porcine sequencing-grade modified trypsin (Promega) in 50 mM ammonium bicarbonate and incubation at 37 °C for 12-16 h. Peptide products were then acidified in 0.1% formic acid. Tryptic peptides were separated by reverse-phase XSelect CSH C18 2.5 μm resin (Waters) on an in-line 150 × 0.075 mm column using a nanoAcquity UPLC system (Waters). Peptides were eluted using a 60 min gradient from 98:2 to 65:35 buffer A:B ratio (buffer A, 0.1% formic acid, 0.5% acetonitrile; buffer B, 0.1% formic acid, 99.9% acetonitrile). Eluted peptides were ionized by electrospray (2.2 kV) followed by MS/MS analysis using higher-energy collisional dissociation (HCD) on an Orbitrap Fusion Tribrid mass spectrometer (Thermo) in top-speed data-dependent mode. MS data were acquired using the FTMS analyser in profile mode at a resolution of 240,000 over a range of 375 to 1,500 m/z. Following HCD activation, MS/MS data were acquired using the ion trap analyser in centroid mode and normal mass range with precursor mass-dependent normalized collision energy between 28.0 and 31.0. Proteins were identified by searching the UniProtKB database restricted to Homo sapiens using Mascot (Matrix Science) with a parent ion tolerance of 3 ppm and a fragment ion tolerance of 0.5 Da, fixed modifications for carbamidomethyl of cysteine, and variable modifications for oxidation on methionine and acetyl on the N terminus. Scaffold (Proteome Software) was used to verify MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they could be established with less than 1.0% false discovery by the Scaffold Local false discovery rate algorithm. Protein identifications were accepted if they could be established with less than 1.0% false discovery and contained at least two identified peptides. Protein probabilities were assigned using the Protein Prophet algorithm 59. Proteins were filtered out if they had a spectral count <8 in all sample groups, and the counts were normalized to log2 NSAF values.
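The final filtering and normalization step can be sketched as follows. The sketch uses the standard definition of the normalized spectral abundance factor, NSAF_i = (SpC_i / L_i) / Σ_j (SpC_j / L_j), where SpC is the spectral count and L the protein length; the text names NSAF but does not spell out the formula, and the function names and the handling of zero counts here are illustrative assumptions.

```python
import math


def nsaf(spectral_counts, lengths):
    """NSAF per protein: length-normalized spectral count divided by the sample total."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def keep_protein(counts_by_group, min_count=8):
    """A protein is dropped only if its spectral count is < min_count in all groups."""
    return any(count >= min_count for count in counts_by_group)


def log2_nsaf(nsaf_values):
    """log2 transform; zero values are skipped (an assumption, log2(0) is undefined)."""
    return {p: math.log2(v) for p, v in nsaf_values.items() if v > 0}
```

Because NSAF values within a sample sum to 1, the log2 values are negative, and enrichment is read from differences between bait and control samples.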

Protein expression and purification
DNA encoding the BAH domain of human TNRC18 (amino acids 2785-2968, TNRC18(BAH)) was inserted into a modified pRSF-Duet vector preceded by an N-terminal hexahistidine (His6)-SUMO tag and a ubiquitin-like protease 1 (ULP1) cleavage site. BL21(DE3) RIL cells harbouring the expression plasmids were induced through the addition of 0.4 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) when the cell density reached an optical density of 0.8 at 600 nm and continued to grow at 16 °C overnight. Cells were collected and lysed in buffer containing 50 mM Tris-HCl (pH 8.0), 1 M NaCl, 25 mM imidazole, 10% glycerol and 1 mM PMSF. Subsequently, the fusion protein was purified through a nickel column, followed by removal of the His6-SUMO tag by ULP1 cleavage, ion-exchange chromatography on a heparin column (GE Healthcare) and size-exclusion chromatography on a HiLoad 16/600 Superdex 75 pg column (GE Healthcare). The purified protein samples were concentrated in 20 mM Tris-HCl (pH 7.5), 100 mM NaCl, 5% glycerol and 5 mM DTT, and stored at −80 °C. The constructs for various TNRC18(BAH) domain mutants were produced by site-directed mutagenesis, and purification of these mutant proteins was conducted in the same way as described above.

Crystallization and data collection
For crystallization, 1.5 mM TNRC18(BAH) was mixed with the H3K9me3-containing peptide, comprising residues 1-22 of histone H3 and a C-terminal tryptophan (for the purpose of peptide quantification), in a molar ratio of 1:3. Subsequently, the protein complex was incubated with 0.1 M sodium cacodylate (pH 6.5), 0.2 M magnesium acetate and 16-20% PEG8000 using the hanging drop vapour-diffusion method at 4 °C. The crystals were soaked in the well solution supplemented with 25% (v/v) glycerol before being flash-frozen in liquid nitrogen. X-ray diffraction data were collected on beamline 24-ID-E at the Advanced Photon Source (APS), Argonne National Laboratory. The datasets were processed using the HKL3000 program (v.721.3) 60. The structure of the TNRC18(BAH)-H3K9me3 complex was solved by molecular replacement with PHASER (v.2.7.16) 61 using the structure of mouse ORC1(BAH) (PDB identifier 4DOV) as a search model. Iterative cycles of model rebuilding and refinement were carried out using COOT (WinCoot 0.8.9 EL) 62 and PHENIX (v.1.20.1_4487) 63, respectively. Structural figures were generated using PyMol (v.2.5.2). Statistics for data processing and structure refinement are summarized in Extended Data Table 1.

ITC
WT or mutant TNRC18(BAH) protein and peptides were respectively dialysed against buffer containing 25 mM Tris-HCl (pH 7.5), 100 mM NaCl and 1 mM DTT at 4 °C overnight. For ITC titration, the syringe and well contained 2 mM peptide and 0.2 mM TNRC18(BAH), respectively. A MicroCal iTC200 system (GE Healthcare) was used to conduct the ITC measurements. A total of 15-17 injections with a spacing of 180 s and a reference power of 5 μcal s⁻¹ were performed at 7 °C. The ITC curves were processed using the software ORIGIN (v.7.0) (MicroCal) with a one-site fitting model as previously described 36,64.

Preparation of chemically modified nucleosome
The NCP carrying a mimic of H3K9me3 modification (H3Kc9me3) was prepared using previously reported protocols 65,66. In essence, histones (H2A, H2B, K9C/C110S-mutated H3 and H4) from Xenopus laevis were expressed in Escherichia coli BL21(DE3) RIL and purified through ion-exchange chromatography using sequential Q-XL and SP-HP (GE Healthcare) columns in a denaturing buffer (20 mM Tris-HCl pH 7.5, 7 M urea and 1 mM β-mercaptoethanol) and a salt gradient of 0 to 1 M KCl. Purified histone proteins were thoroughly dialysed against distilled water containing 2 mM β-mercaptoethanol, lyophilized and stored at −80 °C. The K9C-mutated H3 protein was alkylated as previously described 67. Subsequently, the histone octamer was reconstituted in refolding buffer (20 mM Tris-HCl pH 7.5, 2 M NaCl, 1 mM EDTA and CCCGGTGCCGAGGCCGCTCAATTGGTCGTAGACAGCTCTAGCACCGCTTAAACGCACGTACGCGCTGTCCCCCGCGTTTTAACCGCCAAGGGGATTACTCCCTAGTCTCCAGGCACGTGTCAGATATATACATCCGAT; Widom 601 DNA sequence is underlined) was PCR amplified. The NCP was assembled using the purified histone octamer mixed with the Widom 601 DNA in a 1:1 molar ratio, followed by step-wise dialysis against a buffer containing 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 1 mM DTT and 0.25-2 M KCl. Finally, the assembled NCP was dialysed against a low-salt buffer containing 10 mM Tris-HCl (pH 7.5) and 1 mM DTT overnight. Biotinylated 601 DNA or biotinylated/H3Kc9me3-modified NCP were analysed in a 5% Tris-borate-EDTA (TBE) native gel in 0.2× TBE buffer (89 mM TBE, pH 8.3) at 4 °C and run at 100 V for about 30 min. The DNA band was visualized using SYBR Gold staining and scanned using a ChemiDoc imager (Bio-Rad).

Biolayer interferometry
The binding affinity of TNRC18(BAH) for H3Kc9me3-modified NCP was measured at 30 °C by biolayer interferometry on a Gator Prime instrument (Gatorbio) at a shaking speed of 1,000 r.p.m. using purified TNRC18(BAH) and H3Kc9me3-labelled NCP as described above. The binding buffer contained 20 mM Tris-HCl (pH 7.5), 100 mM NaCl, 1 mM DTT and 0.01% Tween-20. Before the experiments, all biosensors were pre-equilibrated in the binding buffer for 10 min. Assembled nucleosome was diluted to 10 nM to be immobilized on streptavidin biosensors (Gatorbio) to a shift of around 1.0. After a 120-s baseline step in binding buffer, the ligand-loaded biosensors were submerged in various concentrations (2-fold serial dilution from 50 nM to 3.1 nM) of TNRC18(BAH) protein solutions for 120 s and then transferred into the binding buffer for 600 s to measure TNRC18(BAH) association and dissociation kinetics. Data were aligned, inter-step corrected to the association step and further analysed using the Gatorbio data analysis software. The data were fitted using a 1:1 binding model and results were plotted in GraphPad Prism (v.9.1.0). Experiments were repeated two times to ensure consistency.
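As an illustration of the analyte series and of what a 1:1 fit assumes, the sketch below generates the 2-fold dilution series and evaluates the textbook 1:1 (Langmuir) association and dissociation curves. The rate constants in the usage note are hypothetical, and the internals of the Gatorbio fitting software are not described in the text, so this is not a reproduction of its procedure.

```python
import math


def dilution_series(top_nm=50.0, steps=5, factor=2.0):
    """Analyte concentrations in nM: 50, 25, 12.5, 6.25 and 3.125 (about 3.1)."""
    return [top_nm / factor ** i for i in range(steps)]


def association(t, conc, kon, koff, rmax):
    """Response at time t (s) during association for a 1:1 interaction.

    conc in M, kon in M^-1 s^-1, koff in s^-1; K_D = koff / kon.
    """
    k_obs = kon * conc + koff
    r_eq = rmax * conc / (conc + koff / kon)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, koff):
    """Response at time t (s) after transfer to buffer, starting from r0."""
    return r0 * math.exp(-koff * t)
```

With hypothetical constants kon = 1e5 M^-1 s^-1 and koff = 1e-3 s^-1 (K_D = 10 nM), an analyte at 10 nM approaches half of the maximal response at equilibrium, which is the defining property of K_D in this model.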

CRISPR-Cas9-based genome editing for KO or site-specific mutation of TNRC18
For KO of TNRC18 in cells, we used the Alt-R CRISPR-Cas9 system in combination with a pair of targeting crRNAs. For introduction of a point mutation in TNRC18 in cells, we used the same Alt-R CRISPR-Cas9 system in combination with one crRNA and one single-stranded oligodeoxynucleotide donor (ssODN). The Alt-R CRISPR-Cas9 system includes a trans-activating CRISPR RNA (tracrRNA) with ATTO 550, a crRNA targeting the site to be mutated (the sequence information of crRNA is listed in Supplementary Table 7), S.p. HiFi Cas9 nuclease V3 and an electroporation enhancer (all purchased from IDT). The ssODN was custom designed (the sequence information of the ssODN is listed in Supplementary Table 7; https://benchling.com/crispr/) and then ordered from commercial sources (IDT). Preparation and delivery of a CRISPR-Cas9 ribonucleoprotein (RNP) complex (Alt-R CRISPR-Cas9 crRNA-tracrRNA and S.p. HiFi Cas9 nuclease), electroporation enhancer and ssODN into cells were performed according to the manufacturer's guidelines. In brief, crRNA and tracrRNA were mixed in equimolar concentrations, followed by heating at 95 °C for 5 min. The mixture was cooled to room temperature (15-25 °C) on the bench top, allowing the crRNA-tracrRNA duplex to form. The RNP complex was made by diluting the crRNA-tracrRNA duplex and Cas9 enzyme components in PBS, followed by incubation at room temperature for 10-20 min. The RNP complex mixed with the electroporation enhancer and ssODN (for KO, no ssODN is required) was delivered to cells using electroporation with an Amaxa Cell Line Nucleofector kit (Lonza). At 18 h after electroporation, ATTO-550-positive cells were sorted by FACS (UNC flow core) and split into 96-well plates to establish clonal lines for PCR and sequencing-based genotyping. After genotyping, lines with homozygous mutation/deletion were further validated at both DNA and RNA levels by Sanger sequencing of PCR products that covered the mutation/deletion site.

RNA-seq and data analysis
Total RNA was extracted using an RNeasy Mini kit (Qiagen, 74106; with on-column digestion for 0.5 h using the DNase set, Qiagen, 79254), and libraries were then prepared using a NEBNext rRNA Depletion kit (NEB, E6310L) and a NEB Ultra II DNA Library Prep kit (NEB, E7103) per the manufacturer's instructions 57,68. The multiplexed RNA-seq libraries were subjected to deep sequencing using an Illumina NextSeq500 or NovaSeq 6000 sequencer platform. FastQC was used for quality control of high-throughput raw sequencing data.
To comprehensively quantify the expression of all repetitive elements, including TEs, in RNA-seq data, we used REdiscoverTE (v.1.0.1) 74, which, after quantification of repetitive element expression by Salmon, annotates the genes defined in GENCODE and all RepeatMasker sequences in the human genome down to the TE subfamily level. The TE count matrix was appended to the gene expression data matrix for differential expression analysis using DESeq2 (v.1.38.2) 71, and TEs showing significant differential expression were defined as having adjusted P values of less than 0.05 and a log2 value of fold change greater than 1 in the experimental versus control group (with a base mean value greater than 10).
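The significance criteria stated above amount to a three-way threshold on each row of the DESeq2 results table. A minimal sketch, assuming the results have been exported as rows with `padj`, `log2FoldChange` and `baseMean` fields (the DESeq2 column names); the function name and the example rows in the usage note are hypothetical.

```python
def is_significant_te(padj, log2_fold_change, base_mean,
                      padj_cutoff=0.05, lfc_cutoff=1.0, base_mean_cutoff=10.0):
    """Apply the thresholds given in the text to one TE subfamily.

    padj may be None when DESeq2 reports NA; such rows are not called significant.
    """
    if padj is None:
        return False
    return (padj < padj_cutoff
            and log2_fold_change > lfc_cutoff
            and base_mean > base_mean_cutoff)


def significant_tes(rows):
    """Return the names of rows that pass all three thresholds."""
    return [r["name"] for r in rows
            if is_significant_te(r["padj"], r["log2FoldChange"], r["baseMean"])]
```

A row with adjusted P = 0.001, log2 fold change = 2.3 and base mean = 150 passes, whereas the same row with a base mean of 8 does not.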
Because REdiscoverTE is specific to human, we used SalmonTE (v.0.4) 75 to calculate the TE abundance in the mouse samples. By performing read mapping against the TE sequences, SalmonTE reassigns multi-mapping reads using the expectation-maximization algorithm before determining the read count of each sequence. From the counts table, the differentially expressed TEs were identified, and the statistical analysis function of SalmonTE generated the summary of each class and clade through the generalized linear model.

CAGE-seq and data analysis
CAGE-seq library preparation, sequencing, mapping, gene expression, motif discovery analysis and gene ontology (GO) enrichment analysis were performed by DNAFORM (Kanagawa, Japan). The quality of total RNA was assessed using a Bioanalyzer (Agilent) to ensure that the RNA integrity number was greater than 7.0. The cDNAs were synthesized from total RNA using random primers. The ribose diols in the 5′ cap structures of RNAs were oxidized and then biotinylated. The biotinylated RNA/cDNAs were selected using streptavidin beads (cap-trapping). After RNA digestion using RNaseONE/H and adaptor ligation to both ends of cDNA, double-stranded cDNA libraries (CAGE libraries) were constructed. CAGE libraries were sequenced using single-end reads of 75 nucleotides on a NextSeq 500 instrument (Illumina). Obtained reads (CAGE tags) were mapped to the human hg19 genome using BWA (v.0.7.17). Unmapped reads were then mapped using HISAT2 (v.2.0.5). CAGE tag clustering, detection of differentially expressed genes and motif discovery were performed using the RECLU pipeline (v.1.0) 76. Tag count data were clustered using the modified Paraclu program. Clusters with count per million < 0.1 were discarded. Regions that had 90% overlap between replicates were extracted using BEDtools (v.2.12.0). The clusters with an irreproducible discovery rate ≥ 0.1 and clusters longer than 200 bp were discarded. Differentially expressed genes were detected using the edgeR package (v.3.22.5). TEs were annotated using the RepeatMasker bed file from the UCSC table browser. Read counts were normalized by library size, and enrichment scores for each TE family were computed as the normalized count ratio between mutant and WT samples.
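The last step above (library-size normalization followed by a mutant-to-WT ratio per TE family) can be sketched as follows. Counts-per-million scaling is assumed for the normalization, consistent with the CPM cluster filter mentioned in the text; the function names and the numbers in the usage note are hypothetical.

```python
def counts_per_million(count, library_size):
    """Scale a raw tag count by the total number of mapped tags in the library."""
    return count * 1e6 / library_size


def keep_cluster(cpm, min_cpm=0.1):
    """Clusters with CPM below 0.1 are discarded."""
    return cpm >= min_cpm


def te_enrichment(mutant_counts, wt_counts, mutant_library_size, wt_library_size):
    """Per-family ratio of normalized counts, mutant over WT.

    Families with zero WT signal are skipped here to avoid division by zero;
    how such cases were handled in the study is not stated.
    """
    scores = {}
    for family, mutant_count in mutant_counts.items():
        wt_count = wt_counts.get(family, 0)
        if wt_count == 0:
            continue
        scores[family] = (counts_per_million(mutant_count, mutant_library_size)
                          / counts_per_million(wt_count, wt_library_size))
    return scores
```

For example, 200 tags in a 2-million-tag mutant library against 50 tags in a 1-million-tag WT library gives an enrichment score of 2.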

CUT&RUN and data analysis
CUT&RUN 77 was performed as previously described 78,79 with a commercially available kit according to the manufacturer's instructions (EpiCypher CUTANA pAG-MNase for ChIC/CUT&RUN, 15-1116). In brief, 1 million cells were first collected, washed in CUT&RUN wash buffer (20 mM HEPES, pH 7.5, 150 mM NaCl, 0.5 mM spermidine and 1× complete protease inhibitor cocktail), and then bound to activated ConA beads (Bangs Laboratories, BP531). Next, the cell-bead sample was incubated with antibodies against the protein target (all antibodies were added at 1:100 dilution; also refer to the Nature Portfolio Reporting Summary for detailed information) and then permeabilized in digitonin-containing buffer (CUT&RUN wash buffer plus 0.02% digitonin), which was then followed by washing in the digitonin buffer, incubation with pAG-MNase and another wash in the digitonin buffer to remove unbound pAG-MNase. After the final wash, cells were subjected to digestion (after pAG-MNase activation) through the addition of pAG-MNase digestion buffer (digitonin buffer plus 2 mM CaCl2), followed by incubation on a nutator for 2 h at 4 °C. Solubilized chromatin was then released using CUT&RUN stop buffer (340 mM NaCl, 20 mM EDTA, 4 mM EGTA, 50 μg ml⁻¹ RNase A, 50 μg ml⁻¹ glycogen and 0.2 μg fly genomic DNA) and DNA purification was carried out using a PCR cleanup kit (NEB, Monarch PCR & DNA Cleanup kit, T1030). Next, 10 ng of the purified CUT&RUN DNA was used for preparation of multiplexed libraries with a NEB Ultra II DNA Library Prep kit per the manufacturer's instructions (NEB, E7103). Sequencing was conducted using an Illumina NextSeq 500 sequencing system.
Distribution of peaks was analysed using the R package ChIPpeakAnno (v.3.34.1) 82 (bindingType="fullRange", bindingRegion=c(-1,1), ignore.peak.strand=TRUE, select="bestOne"). By using the default parameters of ChIPpeakAnno, 75% of the TNRC18 peaks called by MACS2 (7,242 of a total of 9,656) overlapped TEs (data not shown). However, two parameters needed to be further considered when using ChIPpeakAnno to annotate peaks as TEs: (1) the feature of the reference TE itself and (2) the distance between the TE and the peak (in other words, coverage of a called peak by the TE). For the first parameter, certain very short simple repeat motifs exist in the reference TEs, such as (AT)n with n = 3, and peaks containing such short simple motifs may be incorrectly called as TEs. As this work is focused mainly on long TEs, we only retained those repetitive elements or TE features with a size of no less than 50 bp in the reference (for LINEs, the size was no less than 15% of a full-length one, or 800 bp). For the second parameter, ChIPpeakAnno annotates peaks within 500 bp around TEs as TE peaks by default. In this study, we adopted a more stringent criterion by applying an overlap of no less than 50 bp for calling of TE peaks. In total, 32% of the TNRC18 peaks (3,090 of 9,656) were annotated as TEs by using the above stringent criterion (that is, the length of the TE feature no less than 50 bp and a peak-TE overlap of no less than 50 bp). We conclude that a substantial proportion of TNRC18 peaks are TEs, ranging from 32% (when using our custom-defined, stringent criteria) to 75% (by the default, relaxed parameters). To further annotate TE features, we chose to use the stringent criteria and focus on those TNRC18 peaks most confidently annotated as TEs by ChIPpeakAnno. Among them, around half were further annotated as SINEs and 31% as ERVs (Extended Data Fig. 4b). The overall distribution of each TE subclass in the genome was considered and used for normalization in assessing its enrichment.
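The stringent criteria described above reduce to two interval checks per peak and TE pair. The sketch below is an illustration in plain Python rather than the ChIPpeakAnno call used in the study; it assumes half-open, BED-style coordinates on the same chromosome, and the function names are hypothetical.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of overlapping bases between two half-open intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def is_te_peak(peak, te, min_te_len=50, min_overlap=50):
    """True if the TE feature is long enough and covers enough of the peak.

    peak and te are (start, end) tuples. For LINEs, the text applies a larger
    length cut-off, which can be passed as min_te_len=800.
    """
    te_length = te[1] - te[0]
    if te_length < min_te_len:
        return False
    return overlap_bp(peak[0], peak[1], te[0], te[1]) >= min_overlap


def percent(part, total):
    """Rounded percentage, as used for the peak fractions quoted in the text."""
    return round(100 * part / total)
```

Applied to the peak numbers given in the text, `percent(7242, 9656)` and `percent(3090, 9656)` reproduce the quoted 75% and 32%.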
Motif analysis was performed using hypergeometric optimization of motif enrichment (HOMER) (v.4.8.2) 83. deepTools (v.3.2.0) was used to generate bigwig files, heatmaps and averaged plotting of CUT&RUN signals 84. Genomic binding profiles were generated using the deepTools 'bamCoverage' and 'computeMatrix' functions, and blacklist regions were removed from coverage. The reads at repeated element regions, downloaded from the UCSC Genome Browser (RepeatMasker table), were fractionally assigned using featureCounts (v.2.0.0) with the following parameters: --ignoreDup -p -d 30 -D 1200 -M --fraction 85. These were then normalized to the sequencing depth. The log2 ratio of TNRC18 mutation relative to WT or specific IP relative to IgG was calculated to generate box plots using R.
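The fractional assignment and log2-ratio steps can be illustrated with a simplified sketch. It treats each multi-mapping read as contributing 1/x to each of its x candidate repeat features, which mirrors the documented behaviour of the `-M --fraction` options but operates on a toy read-to-feature mapping rather than on alignments; the pseudocount and all names are assumptions that are not stated in the text.

```python
import math
from collections import defaultdict


def fractional_counts(read_to_features):
    """Split each read equally among the features it maps to."""
    counts = defaultdict(float)
    for features in read_to_features.values():
        if not features:
            continue
        weight = 1.0 / len(features)
        for feature in features:
            counts[feature] += weight
    return dict(counts)


def log2_ratio(count_a, depth_a, count_b, depth_b, pseudocount=1.0):
    """log2 of depth-normalized signal A over B (for example IP over IgG).

    Counts are scaled to reads per million; the pseudocount guards against
    division by zero and is an illustrative choice.
    """
    norm_a = count_a * 1e6 / depth_a
    norm_b = count_b * 1e6 / depth_b
    return math.log2((norm_a + pseudocount) / (norm_b + pseudocount))
```

A read that maps equally well to LTR12C and LTR12D adds 0.5 to each, so repeat families with many near-identical copies are neither dropped nor double counted.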

Whole genome-wide bisulfite sequencing and data analysis
Whole genome-wide bisulfite sequencing (WGBS) and data analyses were conducted as previously described 86. In brief, total genomic DNA was purified and then used for the preparation of WGBS libraries and deep sequencing (performed by Admera Health). Bisulfite treatment of genomic DNA was conducted using an EZ DNA Methylation-Gold kit (Zymo Research, D5005) per the vendor-provided instructions. Then, bisulfite-converted single-stranded DNA was recovered and used for library construction with an Accel-NGS Methyl-seq DNA Library kit (Swift BioSciences, 30024) based on the manufacturer's instructions. DNA enrichment was performed using PCR with primers compatible with Illumina sequencing. After quantity assessment and molecular size measurement, the WGBS libraries were pooled and sequenced using an Illumina NovaSeq S4 sequencer platform (150-bp read lengths in paired-end mode, with an output of approximately 750 million reads or 375 million in each direction per sample, and with 20% PhiX DNA added as spike-in controls). Read mapping and CpG calling were performed using the Bismark pipeline 87. Then, the annotation of CpG sites was performed on the basis of the UCSC Genome Browser (RepeatMasker and CpG islands tracks). Data were filtered, normalized and calculated using the R package 'methylKit' (v.1.26.0) (filterByCoverage(lo.count = 10, lo.perc = NULL, hi.count = NULL, hi.perc = 99.9), normalizeCoverage(method="median")). The violin plots of CpG methylation representing each site with the same annotation group were generated using the R package ggpubr.
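The effect of the coverage filter quoted above (lo.count = 10, hi.perc = 99.9) can be shown with a small Python re-expression. This is not the methylKit implementation: the percentile here uses linear interpolation, which is an assumption about the quantile method, and the function names are hypothetical.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics (0 <= q <= 1)."""
    xs = sorted(values)
    h = (len(xs) - 1) * q
    lo = int(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


def filter_by_coverage(coverages, lo_count=10, hi_perc=99.9):
    """Keep CpG sites with coverage >= lo_count and <= the hi_perc percentile.

    The lower bound removes poorly covered sites; the upper bound removes
    sites with extreme coverage, which often reflect PCR duplication.
    """
    upper = quantile(coverages, hi_perc / 100.0)
    return [c for c in coverages if lo_count <= c <= upper]
```

With 999 sites at 10× and a single site at 10,000×, the 99.9th percentile falls just under 20×, so only the outlier is removed.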

CRISPR-Cas9-based Tnrc18 mutation KI in mice and phenotypic characterization
Tnrc18 KI mice bearing the W2745A/Y2747A mutation were generated in the Animal Models Core (AMC) facility affiliated to UNC School of Medicine by pronuclear microinjection of CRISPR-Cas9 reagents in mouse embryos with a previously described protocol 35. Cas9 gRNAs targeting the mouse Tnrc18 W2745A/Y2747A-encoding codon were identified using Benchling software. Three gRNAs overlapping this region were selected for activity testing. Selected gRNAs were cloned into a T7 promoter vector followed by in vitro transcription and spin column purification. Functional testing was performed by transfecting a MEF cell line with gRNA and recombinant Cas9 protein (UNC Protein Expression and Purification Core Facility). Following transfection, the gRNA target site in Tnrc18 was PCR amplified from transfected cells and analysed using a T7endo1 assay to detect Cas9-mediated mutation. The guide RNA selected for genome editing in embryos was Tnrc18-sg85T (protospacer sequence 5′-GTGGTTCTACCATCCGG-3′).
Microinjected embryos were implanted in recipient pseudopregnant females. Five resulting pups were screened by PCR and sequencing for the presence of the mutant allele. The correct mutant allele was detected in two females born from microinjection. One positive founder was from the 25 ng μl⁻¹ guide RNA mix, and the other founder was from the 10 ng μl⁻¹ mix. We routinely examined around 8-10 of the most highly predicted off-target sites to ensure that they were unaffected in the founders. Female founders were mated to WT C57BL/6J males, but only one transmitted the mutant allele through the germline to offspring. To establish the mutant colony for further studies, F1 animals heterozygous for the correct KI allele were backcrossed with WT C57BL/6 mice more than six times to remove or segregate potential off-target mutations. Then, we used the backcrossed heterozygous mice for breeding with WT C57BL/6 mice to maintain this line. Colony maintenance, daily care and monitoring of the lines were carried out in collaboration with the Animal Studies Core, UNC.

Generation of primary MEF cells
The mice were set up for breeding, and embryonic day 13.5 embryos were collected. The isolation and establishment of primary MEF cells were conducted as previously described 88. In brief, unwanted organs of embryos were removed, followed by tissue homogenization and enzymatic dissociation of cells. Genotypes of cells were determined by PCR and Sanger sequencing. Cells were then cultured in DMEM base medium supplemented with 15% FBS for 25 passages to establish stable MEF cell lines.

Luciferase reporter assay
Cells seeded in 24-well plates were co-transfected with plasmids of GAL4-DBD-TNRC18, a luciferase reporter pGL2-5×GAL4-SV40 (Promega) and an internal control (pRL-CMV-Renilla). Forty-eight hours after transfection, luciferase activity was measured using a Dual Luciferase Reporter Assay system (Promega, E1910). Data were normalized to Renilla luciferase signals.
The number of analyzed copies per transposon type: group 'unique mapping only' (LTR12 (n = 122 copies), LTR12C (n = 612 copies), LTR12D (n = 54 copies), HERV9-int (n = 58 copies)); group 'multi-mapping allowed' (LTR12 (n = 124 copies), LTR12C (n = 678 copies), LTR12D (n = 55 copies), HERV9-int (n = 60 copies)). Sample size of each box plot is also listed in Supplementary Table 8. d. Gene set enrichment analysis (GSEA) revealed enrichment for the indicated pathways in cells with TNRC18 KD (shTNRC18), compared to mock controls (shCtrl) (n = 2 biologically independent experiments). Immunity-related gene sets are labelled in red. The y-axis and x-axis show the normalized enrichment score (NES) and false discovery rate (FDR) q-values of GSEA, respectively. e. GSEA revealed the positive correlation between activation of genes related to immunity and TNRC18 KD (shTNRC18) in HEK293 cells, compared to those with the scramble controls (shCtrl) or with the rescued re-expression of TNRC18 (shTNRC18_rescue). NES, normalized enrichment score. The P value was calculated by a two-sided empirical phenotype-based permutation test; the false discovery rate q-value is adjusted for gene set size and multiple hypothesis testing whereas the P value is not. f. GSEA revealing the positive correlation between activation of genes related to immunity and TNRC18 KO in HEK293 cells, compared to WT controls. The P value was calculated by a two-sided empirical phenotype-based permutation test; the false discovery rate q-value is adjusted for gene set size and multiple hypothesis testing whereas the P value is not.
Sample size of each box plot is listed in Supplementary Table 8. d. Heatmap of anti-GFP CUT&RUN signals in HEK293 cells stably expressing GFP-TNRC18 (1st column), or those for TNRC18, probed with an antibody against endogenous TNRC18, in either parental HEK293 (2nd column), HeLa (3rd column) or K562 cells (4th column), across ±5 kb from the most confident TNRC18 peaks defined in HEK293 cells (n = 7545; peaks common to GFP-TNRC18 [1st column] and TNRC18 [2nd column] in HEK293 cells were used). e. IGV tracks showing the CUT&RUN signal for TNRC18 in the indicated cells at the representative ERV (left), promoter-TSS (middle) or intergenic target site (right). TSS, transcription start site. f,g. Heatmap for the CUT&RUN signals of TNRC18 and H3K9me3 (probed with two independent antibodies), across ±5 kb from those TNRC18 peaks that were annotated as TEs (f) or LTRs (g). h. Motif search analysis revealing the most enriched motifs at the TNRC18 (top panel) and GFP-TNRC18 peaks (bottom panel) in HEK293 cells.
Extended Data Fig. 6 | Structural basis of TNRC18 BAH binding to H3K9me3 peptide and biochemical analysis of TNRC18 BAH binding to H3Kc9me3-modified nucleosome. a. Crystal structures of the two color-coded human TNRC18 BAH molecules in one asymmetric unit, with the chain identifiers labeled (chain A and B). Each TNRC18 BAH molecule is complexed with one H3K9me3 peptide (chain C or D). The TNRC18 BAH-H3K9me3 complex with the best model-to-map fit (chain B and D) was selected for structural analysis. b. Electrostatic surface view of human TNRC18 BAH bound to the H3K9me3 peptide (yellow sticks). (left) The Fo-Fc omit map of the H3K9me3 peptide, contoured at 1.5σ level, is shown as magenta mesh. (right) The surface patch enriched with basic residues is indicated as the potential binding site to nucleosome/DNA. c. ITC binding curves of TNRC18 BAH against histone peptides with the indicated modification. d. ITC binding curves of the indicated TNRC18 BAH mutants against the H3K9me3 peptide. e. Structural comparison of ORC1 BAH-H4K20me2 (electrostatic surface and stick representation; PDB 4DOW), DNMT1 BAH1-H4K20me3 (electrostatic surface and stick representation; PDB 7LMK) and BAHCC1 BAH-H3K27me3 (ribbon and stick representation; PDB 6VIL). f. Structural superposition of TNRC18 BAH with the nucleosome core particle (NCP)-bound yeast Sir3 BAH domain (Sir3 BAH; PDB 4KUD) reveals that the basic patch of TNRC18 BAH (Extended Data Fig. 6b, right panel) corresponds to a similar region of Sir3 BAH involved in binding to the acidic patch of NCP. g. Assessment of purified H3Kc9me3-modified NCP on a native 5% TBE gel. The positions of reconstituted NCP and biotinylated 601 DNA are labeled on the right. h. The BLI kinetic curves for the TNRC18 BAH-NCP binding. The concentrations of TNRC18 BAH used for each kinetic measurement are indicated. The K d value and s.d. were derived from two independent measurements.

Extended Data
Extended Data Fig. 7 | A H3K9me3-binding-defective mutation of TNRC18 BAH leads to the activated expression of the LTR12 family TEs, TINATs and interferon-stimulated genes. a. Sanger sequencing verified homozygous knock-in (KI) mutation of TNRC18 W2858A, introduced by the CRISPR-Cas9-based gene editing, in HEK293 cells. b. RT-qPCR and Western blot of TNRC18 in HEK293 cells, either wild type (WT) or with homozygous mutation of TNRC18 W2858A. Data were plotted as the mean ± s.d. after normalization to the signals of an internal control, GAPDH, and to those of WT cells (n = 3 independent experiments). Vinculin is the sample processing control. c. GSEA revealed the positive correlation between activation of the indicated immunity-related gene sets and the H3K9me3-binding-defective mutation of TNRC18 (W2858A), relative to WT, in HEK293 cells (n = 2 independent experiments). The P value was calculated by a two-sided empirical phenotype-based permutation test; the false discovery rate q-value is adjusted for gene set size and multiple hypothesis testing whereas the P value is not. d-e. Overall expression levels of genes associated with all TNRC18 peaks (d, left), those with TNRC18-bound promoter/TSS regions (d, right), or genes close to (within 50 kb) the TNRC18-bound LTRs (e), based on RNA-seq profiling of HEK293 cells, either WT (left) or carrying the TNRC18 W2858A homozygous mutation (right). VST, variance-stabilizing transformation. The boundaries of box plots indicate the 25th and 75th percentiles, the center line indicates the median, and the whiskers indicate 1.5× the interquartile range. Sample size of each box plot is listed in Supplementary Table 8. The P value was calculated by two-sided Wilcoxon test. f. Averaged distance to the nearest TNRC18/H3K9me3-bound LTRs from genes exhibiting either down-regulation (left), no expression change (stable; middle) or up-regulation (right) in HEK293 cells carrying the TNRC18 W2858A homozygous mutation, relative to WT, based on RNA-seq. The boundaries of box plots indicate the 25th and 75th percentiles, the center line indicates the median, and the whiskers indicate 1.5× the interquartile range. P value was calculated by two-sided Wilcoxon test. Sample size of each box plot is listed in Supplementary Table 8. g. RT-qPCR of representative treatment-induced non-annotated TSSs (TINATs 40) in HEK293 cells carrying the TNRC18 W2858A homozygous mutation, relative to WT (n = 3 independent experiments; *P < 0.05; **P < 0.01, two-sided t-test). Data were plotted as the mean ± s.d. after normalization to signals of GAPDH and then to those of WT. The exact P value is shown in Supplementary Table 8. h. Scatter plot showing Pearson correlation of the CAGE-seq profiles of HEK293 cells, WT or with the TNRC18 W2858A homozygous mutation (n = 2 independent experiments). Rep-1/2, biological replicate 1 or 2. i-j. IGV view of CAGE-seq profiles at the indicated gene (i) or long non-coding RNA (lncRNA; j) containing a nearby LTR12 in HEK293 cells, WT or with the TNRC18 W2858A homozygous mutation.
Extended Data Fig. 9 | TNRC18 binds corepressors, mediating transcriptional repression. a. Scatter plots of the indicated TNRC18-interacting proteins identified by BioID in HEK293 (x-axis) and HeLa cells (y-axis). The mass spectrometry data following BioID in HeLa cells are shown in Supplementary Table 5. b. Pearson correlation plot using the IF signals of the indicated protein in HEK293 cells. c. Structural model of the Sin3A's PAH domain (green) in complex with the TNRC18 peptide (cyan). The structure was predicted via PHYRE2 (http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index). The model of the complex was generated by the Coot program, using the structure of mouse Sin3A PAH1-SAP25 SID complex (PDB 2RMS) as template. d. Sanger sequencing (top) and Western blot (bottom) for the TNRC18 L760A KI mutation introduced into HEK293 cells. Vinculin is the sample processing control. e. Scatter plot showing the indicated TE families that exhibit significant expression change, based on RNA-seq profiles of HEK293 cells with the TNRC18 L760A mutation versus WT controls. Adjusted P value is calculated by negative binomial model-based methods (DESeq2). f. RT-qPCR of the indicated TEs in HEK293 cells with TNRC18 L760A mutation versus WT controls (n = 3 independent experiments; **P < 0.01; ***P < 0.001; ****P < 0.0001, two-sided t-test). Data were plotted as the mean ± s.d. after normalization to signals of GAPDH and then to those of WT. The exact P value is shown in Supplementary Table 8. g. GSEA revealed enrichment for the indicated pathways in HEK293 cells with the TNRC18 L760A mutation versus WT controls. Immunity-related gene sets are labelled in red. The P value was calculated by a two-sided empirical phenotype-based permutation test; the false discovery rate q-value is adjusted for gene set size and multiple hypothesis testing whereas the P value is not. h. RT-qPCR of the indicated immunity-related genes in WT and TNRC18-L760A-mutated HEK293 cells (n = 3 independent experiments; *P < 0.05; ***P < 0.001; ****P < 0.0001, two-sided t-test). Data were plotted as the mean ± s.d. after normalization to signals of GAPDH and then to those of WT. The exact P value is shown in Supplementary Table

Fig. 1 | TNRC18 functions as an ERV repressor. a, CasID identified ERV-bound proteins in HEK293 cells, with hits ranked by the fold change of normalized spectral abundance factor (NSAF) relative to control. Label font colors: green, zinc finger proteins; blue, histone and DNA-methylation-related factors; yellow, histone chaperones; purple, N6-methyladenosine-related factors; red, TNRC18. b,c, Scatter plots showing TE families exhibiting expression changes based on RNA-seq of HEK293 cells with endogenous TNRC18 KD (using TNRC18 shRNA (shTNRC18); b) relative to TNRC18 KD followed by TNRC18 re-expression (shTNRC18-rescue; n = 2 independent experiments), or cells with endogenous TNRC18 KO (TNRC18-KO; c) relative to WT (n = 3 independent experiments). The significance cut-off is the fold change in expression over 1.50 and adjusted P value less than 0.01 for transcripts with base mean read counts over 10. Adjusted P value was calculated using negative binomial model-based methods (DESeq2). d, RT-qPCR for TEs (top) and immunity-related genes (bottom) in HEK293 cells with shTNRC18 versus control shRNA (shControl) or shTNRC18-

Fig. 2 | TNRC18 colocalizes with H3K9me3 at ERV regions. a,c, Box plots showing log2 fold change values in CUT&RUN signals of the indicated factor, TNRC18 (probed with either endogenous TNRC18 antibodies or anti-GFP antibodies for GFP-tagged TNRC18) or H3K9me3 (probed with antibodies from Abcam or Active Motif (AM)), compared with those from IgG controls, at ERV1 (a, left) including the LTR12 subfamily of ERV1 (c), or LINE1 regions (a, right). The boundaries of box plots indicate the 25th and 75th percentiles, the centre line indicates the median, and the whiskers indicate 1.5× the interquartile range. Sample sizes of each box plot in a-c are listed in Supplementary Table 8. IP, immunoprecipitation. b, Scatter plots showing averaged CUT&RUN signals of TNRC18 and H3K9me3 at the indicated individual ERV subfamily. d, Motif search analysis revealing the most enriched motifs at the called TNRC18 (top) or GFP-TNRC18 (middle) peaks. The bottom panel illustrates LTR12 at the

Fig. 3 | The BAH module of TNRC18 specifically reads H3K9me3. a, Domain architecture of human TNRC18, with the boundaries of individual domains marked. b, ITC binding curves for human TNRC18(BAH) and peptides with different degrees of H3K9 methylation. NDB, no detectable binding. c, Overall structure of TNRC18(BAH) (cyan) bound to H3K9me3 peptide (yellow sticks). The α-helices and β-strands are labelled in alphabetical and numerical order, respectively. d, Positioning of the H3K9me3 side chain (yellow) within the aromatic cage of TNRC18(BAH). The cage residues (pink) of human TNRC18(BAH) are labelled in red. e, Close-up view of TNRC18(BAH)-H3K9me3 peptide interactions. The interacting residues of TNRC18(BAH) and peptide are shown as pink and yellow sticks, respectively. Dashed lines denote hydrogen bonds. f, Schematic of TNRC18(BAH) (black) and H3K9me3 (magenta) interactions. Hydrogen bonds and electrostatic interactions are shown as black and green dashed lines, respectively. Hydrophobic and van der Waals interactions are in yellow. g, Top, close-up view of the interaction between Thr6 of H3K9me3 peptide and TNRC18(BAH). Hydrogen bonds are shown as dashed lines. Van der Waals interactions are shown as solid lines, with distances labelled in angstrom units. Bottom, sequence alignment of the histone H3 sequences around H3K9me3 and H3K27me3. h, Summary of ITC measurements for WT TNRC18(BAH) binding to H3 peptides with the indicated modification. i, Summary of ITC measurements for binding of TNRC18(BAH), WT or the indicated mutant, to H3K9me3 peptide.

Fig. 4 | TNRC18(BAH)-H3K9me3 engagement mediates TE silencing. a, RNA-seq identified TE subfamilies exhibiting expression changes in HEK293 cells with a TNRC18(BAH) homozygous knock-in (KI) mutation (W2858A) versus WT (n = 2 independent experiments). Significance cut-off is the fold change in expression over 1.50 and adjusted P value less than 0.01 (calculated using negative binomial model-based methods; DESeq2) for transcripts with base mean read counts over 10. b, GSEA revealed the gene set enrichment in HEK293 cells with a TNRC18 W2858A homozygous mutation versus WT (n = 2 independent experiments). c, RT-qPCR for the indicated TEs or genes in HEK293 cells with a TNRC18 W2858A homozygous mutation versus WT (n = 3 independent experiments; plotted as the mean ± s.d. after normalization to GAPDH and to WT). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, two-sided t-test. Exact P values are shown in Supplementary Table 8. d, CAGE-seq revealing enrichment for the indicated TEs activated in HEK293 cells with a TNRC18 W2858A homozygous mutation versus WT (n = 2 independent experiments). e,f, IGV view of CAGE-seq signals at LTR12 close to DHRS2 (e) and TNFRSF10B (f) in HEK293 cells, with WT TNRC18 or TNRC18 W2858A homozygous mutation. Rep, replicate. g, Bar chart (top) and chi-square test (bottom) of the indicated genotype, WT (w/w; n = 51) or with heterozygous (w/m; n = 97) or homozygous (m/m; n = 13) Tnrc18 W2745A/Y2747A mutation, among newborn pups (P0) from breeding of Tnrc18 W2745A/Y2747A heterozygous mice. n, cohort size. h, RNA-seq revealed the indicated TEs that exhibited expression changes in primary lung tissues from mice with a Tnrc18 W2745A/Y2747A mutation (n = 3 mice with two technical replicates each) compared with those of WT littermates (n = 2 mice with two technical replicates each). Significance cut-off is the same as in a, except that the expression fold change is set to 2. Adjusted P value was calculated using negative binomial model-based methods (DESeq2). i, GSEA revealed pathway enrichment in lungs from mice with a Tnrc18 W2745A/Y2747A mutation versus WT. For b and i, immunity-related gene sets are labelled in red.
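
The chi-square test on the P0 genotype counts in g can be illustrated with a short worked example. The following Python sketch is a minimal illustration under a stated assumption: it tests the observed counts (w/w, 51; w/m, 97; m/m, 13) against a Mendelian 1:2:1 expectation for a heterozygous intercross, which the legend does not state explicitly. The function names `chi_square_gof` and `chi2_sf_two_df` are hypothetical and are not taken from the analysis code of this study.

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Return (statistic, expected counts) for a chi-square goodness-of-fit test."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Expected counts if the genotypes followed the given ratio exactly.
    expected = [total * r / ratio_sum for r in expected_ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


def chi2_sf_two_df(statistic):
    """Upper-tail P value for a chi-square statistic with 2 degrees of freedom.

    For exactly 2 degrees of freedom the survival function is exp(-x / 2),
    so no statistics library is needed for three genotype classes.
    """
    return math.exp(-statistic / 2.0)


# Genotype counts at P0 as given in the legend: w/w, w/m, m/m.
observed = [51, 97, 13]
# ASSUMPTION: Mendelian 1:2:1 expectation for a heterozygous intercross.
statistic, expected = chi_square_gof(observed, [1, 2, 1])
p_value = chi2_sf_two_df(statistic)

print(f"expected counts: {expected}")
print(f"chi-square = {statistic:.2f}, P = {p_value:.2e}")
```

Three genotype classes give two degrees of freedom, which is why the closed-form tail probability suffices here; a different expected ratio or more classes would require a general chi-square distribution function.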

Fig. 5 | TNRC18 interacts with co-repressors to maintain repression. a, BioID of HEK293 cells identified TNRC18-associated proteins, grouped in different multi-subunit complexes using STRING. b, Immunofluorescence of the indicated protein in HEK293 cells. Representative results from three independent experiments. Scale bar, 10 μm. c, Right, ITC fitting curves of WT or mutant (Mut) TNRC18 amino acids 718-792 titrated against the PAH domain of Sin3. d, Scatter plots showing CUT&RUN signal correlations in HEK293 cells, with Pearson correlation coefficient values indicated at the upper left. e, Plots for the indicated CUT&RUN signals at TNRC18 peaks (n = 7,545; common peaks based on both TNRC18 and GFP-TNRC18 CUT&RUN) in HEK293 cells, WT or TNRC18(W2858A)-mutant. RPKM, reads per kilobase per million mapped reads. Boxes indicate 25th to 75th percentile, whiskers the full data range, and the centre line the median. P values were calculated using two-sided Wilcoxon matched-pairs signed-rank test. ****P < 0.0001. Exact P values are shown in Supplementary Table 8. f, Top, averaged H3ac and H3K9me3 CUT&RUN signals

Extended Data Fig. 1 | Unbiased CasID-based proteomic approach identified a previously less-studied nuclear protein TNRC18 as a putative binding protein at ERV regions. a. Workflow of CasID. A set of sgRNAs targeting representative ERV regions (sgRNA sequence information provided in Supplementary Table 7) was transduced into cells with stable expression of dCas9-BirA*. The sgRNA-based targeting of dCas9-BirA* to ERVs results in the BirA*-mediated biotinylation of proteins in proximity (within approximately 10 nm) in the presence of biotin. Then, biotinylated proteins were enriched by NeutrAvidin affinity purification, followed by mass spectrometry-based protein identification. b. Representative confocal immunofluorescence (IF) microscopy images showing the nuclear localization pattern of TNRC18 in HEK293 cells, stained with DAPI (left) or probed with anti-TNRC18 antibodies (middle). Scale bars, 10 μm. Data shown represent 3 independent experiments. c. The cDNA of TNRC18 used for mammalian cell expression, which contains an Avi-tag and a GFP tag at the N-terminus and a 3×FLAG tag at the C-terminus. d. Representative confocal IF microscopy images for the above tagged TNRC18 in the HEK293 stable expression cells, co-stained with the mouse anti-GFP antibody and the rabbit antibody against endogenous TNRC18 (top panel), or with the rabbit anti-GFP antibody and the mouse anti-FLAG antibody (middle panel), or with the mouse anti-FLAG antibody and the rabbit antibody against endogenous TNRC18 (bottom panel), all of which exhibited a co-localization pattern in the nucleus. Data shown represent 3 independent experiments. e. Immunoprecipitation (IP) of HEK293 cells expressing GFP-3×FLAG-tagged TNRC18, compared to empty vector (EV)-transduced control cells, by using anti-FLAG beads. The IP sample was immunoblotted with anti-FLAG antibody (top panel) or an antibody against endogenous TNRC18 (bottom panel; for TNRC18 protein size, refer to Methods of IP as well). f. RT-qPCR of TNRC18 in the HEK293 cells with knockdown (KD) of endogenous TNRC18 (shTNRC18), compared to scramble controls (shCtrl), and in the cells with endogenous TNRC18 KD followed by the rescued re-expression of shTNRC18-resistant TNRC18 (i.e., shTNRC18_rescue; n = 3 biologically independent experiments). Data were plotted as the mean ± s.d. after normalization to the signals of an internal control (GAPDH) and to those of the control samples. g. Scatter plot showing the indicated transposable element families exhibiting significant expression change, based on RNA-seq profiles of HEK293 cells with shTNRC18, compared to scramble controls (shCtrl) (n = 2 biologically independent experiments). The cut-off of statistical significance is a log2 value of fold-change in expression (y-axis) over 0.58 and an adjusted P value (x-axis) less than 0.01 for transcripts with base mean read counts over 10. Adjusted P value is calculated by negative binomial model-based methods (DESeq2). h. Classification of endogenous retroviruses in the human cells. Figure adapted from ref. 89, with permission from Elsevier and under a Creative Commons licence CC-BY 4.0, and ref. 14, BioMed Central.

Extended Data Fig. 2 | TNRC18 knock-down (KD) or knockout (KO) results in activation of immunity-related genes in HEK293 cells. a. Sanger sequencing to show frame-shifting mutation and knock-out of TNRC18 in HEK293 cells. b. Western blot to show the knock-out of TNRC18 in HEK293 cells. Vinculin is the sample processing control. c. RNA-seq analysis using unique mapping reads (left) and multi-mapping reads (right) of the indicated HEK293 cells. The boundaries of box plots indicate the 25th and 75th percentiles, the center line indicates the median, and the whiskers indicate 1.5× the interquartile range.
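
The significance cut-off used for the RNA-seq scatter plots (a log2 fold change over 0.58, which corresponds to a fold change over 1.50; adjusted P value less than 0.01; base mean read counts over 10) can be expressed as a simple filter on a DESeq2-style results table. The Python sketch below is a minimal illustration only. The record type `TEResult`, the function `is_significant` and the example rows (`TE_A` to `TE_C`, with invented values) are hypothetical, and testing the absolute log2 fold change, so that both up- and downregulated families can pass, is an assumption that the legends do not state.

```python
import math
from dataclasses import dataclass


@dataclass
class TEResult:
    """One row of a DESeq2-style results table (hypothetical layout)."""
    name: str
    base_mean: float
    log2_fold_change: float
    padj: float


def is_significant(row, min_fold_change=1.5, max_padj=0.01, min_base_mean=10.0):
    """Apply the three thresholds stated in the legends to one row."""
    # A fold change of 1.5 equals log2(1.5), roughly 0.58 on the log2 scale.
    log2_cutoff = math.log2(min_fold_change)
    # ASSUMPTION: the cut-off applies to the magnitude of the change.
    return (
        abs(row.log2_fold_change) > log2_cutoff
        and row.padj < max_padj
        and row.base_mean > min_base_mean
    )


# Toy rows with invented values, for illustration only.
rows = [
    TEResult("TE_A", base_mean=250.0, log2_fold_change=1.8, padj=1e-6),
    TEResult("TE_B", base_mean=4.0, log2_fold_change=2.5, padj=1e-4),   # too few reads
    TEResult("TE_C", base_mean=900.0, log2_fold_change=0.3, padj=1e-8),  # change too small
]

hits = [row.name for row in rows if is_significant(row)]
print(f"log2(1.5) = {math.log2(1.5):.2f}")
print(f"significant families: {hits}")
```

Passing `min_fold_change=2` reproduces the stricter threshold stated for the mouse lung RNA-seq in Fig. 4h.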

Extended Data Fig. 3 | TNRC18 KO using four different human cancer cell lines of epithelial origins. a. Bulk tissue gene expression for TNRC18 in the GTEx Analysis Release V8 (ref. 90). Expression values are shown in TPM (Transcripts Per Million), calculated from a gene model with isoforms collapsed to a single gene. The boundaries of box plots indicate the 25th and 75th percentiles, the center line indicates the median, and the whiskers indicate 1.5× the interquartile range. Sample size of each box plot is listed in Supplementary Table 8. b-c. Sanger sequencing (b) and Western blot (c) to show frame-shifting mutation and KO of TNRC18 in the indicated edited cells, in comparison to WT. Vinculin is the sample processing control. For TNRC18 protein size, see Methods as well. d. GSEA revealed enrichment for the indicated pathways in SNU-1 cells (left) and NCI-H23 cells (right) with TNRC18 KO, compared to WT controls. Immunity-related gene sets are labelled in red. NES, normalized enrichment score. The P value was calculated by a two-sided empirical phenotype-based permutation test; the false discovery rate q-value is adjusted for gene set size and multiple hypothesis testing whereas the P value is not.

Extended Data Fig. 4 | TNRC18 and H3K9me3 co-localize at ERV regions. a. Scatter plot showing correlation between signals of CUT&RUN for endogenous TNRC18 in parental HEK293 cells (x-axis; using anti-TNRC18 antibody) and those for GFP-TNRC18 following its stable expression in HEK293 cells (y-axis; using anti-GFP antibody). Pearson correlation coefficient is shown. b. Pie chart showing distribution of the indicated TE annotation using the TNRC18 CUT&RUN peaks annotated as TE by ChIPpeakAnno. c. Box plots showing the log2 values of fold-changes in signals of CUT&RUN for the indicated protein, relative to IgG controls, at different TE classes in HEK293 cells. CUT&RUN for TNRC18 was conducted with an antibody against endogenous TNRC18 in parental HEK293 cells, or an anti-GFP antibody in cells stably expressing GFP-TNRC18. CUT&RUN for H3K9me3 was performed by using two independent antibodies, from either Abcam or Active Motif Inc. (AM). The boundaries of box plots indicate the 25th and 75th percentiles, the center line indicates the median, and the whiskers indicate 1.5× the interquartile range.
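
The box plot convention stated in these legends (box boundaries at the 25th and 75th percentiles, center line at the median, whiskers at 1.5× the interquartile range) can be sketched in a few lines of Python. This is a minimal example under assumptions that the legends do not specify: the whiskers are drawn to the most extreme data points lying within 1.5× the interquartile range of the box (the Tukey convention), and quartiles use the "inclusive" interpolation of the standard library. The function name `box_plot_stats` and the toy values are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Summarize values the way a Tukey-style box plot would draw them."""
    # Quartiles: 25th percentile, median, 75th percentile.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # ASSUMPTION: whiskers end at the most extreme points inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }


# Toy values for illustration only (not data from this study).
toy_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_plot_stats(toy_values)
for key, value in stats.items():
    print(f"{key}: {value}")
```

Plotting libraries differ in their default quartile interpolation, so box boundaries computed this way can deviate slightly from a given figure for small sample sizes.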

Extended Data Fig. 5 | TNRC18 BAH is a conserved domain binding specifically to H3K9me3. a. Sequence alignment of TNRC18 BAH among different species. The secondary structures of human TNRC18 BAH are indicated on top. The H3K9me3-binding pocket residues are indicated by red asterisks, and the rest of the H3-binding sites are indicated by filled black circles at the bottom. b. SDS-PAGE image of the purified recombinant protein of TNRC18 BAH used for biochemical and structural studies. c. ITC fitting parameters of TNRC18 BAH binding to various histone peptides. NDB, no detectable binding. d. ITC fitting curves of TNRC18 BAH against the indicated histone peptides trimethylated at different lysine sites. e. Original ITC binding curves of the recombinant TNRC18 BAH protein against histone peptides with the indicated modification.

8. i. Heatmap of CUT&RUN signals for TNRC18, HDAC2, TRIM28 and SETDB1 in HEK293 cells, across ±5 kb from the TNRC18 peaks (n = 7,545; defined as the common peaks of TNRC18 and GFP-TNRC18, based on CUT&RUN in HEK293 cells). j. IGV tracks showing the CUT&RUN signals of the indicated protein at the reported TRIM28 target gene in HEK293 cells. k. The motif search analysis revealed the binding motifs of KRAB-ZnF proteins to be enriched at the TNRC18 peaks.

Extended Data Table 1 | Crystallographic data collection and refinement statistics of the TNRC18_BAH:H3(1-22)K9me3 complex. a Values in parentheses are for the highest-resolution shell. The dataset was collected from a single crystal.

f, Top, averaged H3ac and H3K9me3 CUT&RUN signals at LTR12 families (LTR12C, LTR12D and LTR12E), LINEs and SINEs in WT (y axis) or TNRC18(W2858A)-mutant (x axis) HEK293 cells. Bottom, box plots of log2 values of the ratios for CUT&RUN counts at TEs in TNRC18(W2858A)-mutant versus WT cells. Box boundaries indicate 25th and 75th percentiles, the centre line the median, and whiskers 1.5× the interquartile range. Sample sizes are provided in Supplementary Table 8. g, CUT&RUN signals for the indicated factor at a LTR12C in chromosome 14 in WT or TNRC18(W2858A)-mutant HEK293 cells.