Identification of substrates of the small RNA methyltransferase Hen1 in mouse spermatogonial stem cells and analysis of its methyl-transfer domain

Small noncoding RNAs (sncRNAs) regulate many genes in eukaryotic cells. Hua enhancer 1 (Hen1) is a 2′-O-methyltransferase that adds a methyl group to the 2′-OH of the 3′-terminal nucleotide of sncRNAs. The types and properties of sncRNAs may vary among different species, and the domain composition, structure, and function of Hen1 proteins differ accordingly. In mammals, Hen1 specifically methylates sncRNAs called P-element–induced wimpy testis-interacting RNAs (piRNAs). However, other types of sncRNAs that are methylated by Hen1 have not yet been reported, and the structures and the substrates of mammalian Hen1 remain unknown. Here, we report that mouse Hen1 (mHen1) performs 3′-end methylation of classical piRNAs, as well as those of most noncanonical piRNAs derived from rRNAs, small nuclear RNAs and tRNAs in murine spermatogonial stem cells. Moreover, we found that a distinct class of tRNA-derived sncRNAs are mHen1 substrates. We further determined the crystal structure of the putative methyltransferase domain of human Hen1 (HsHen1) in complex with its cofactor AdoMet at 2.0 Å resolution. We observed that HsHen1 has an active site similar to that of plant Hen1. We further found that the putative catalytic domain of HsHen1 alone exhibits no activity. However, an FXPP motif at its N terminus conferred full activity to this domain, and additional binding assays suggested that the FXPP motif is important for substrate binding. Our findings shed light on its methylation substrates in mouse spermatogonial stem cells and the substrate-recognition mechanism of mammalian Hen1.

tion length or truncation at their 3′-unmethylated ends, and a considerably reduced abundance, resulting in pleiotropic phenotypes (13–17). The previously reported Arabidopsis thaliana Hen1 (AtHen1) crystal structure showed that AtHen1 consists of multiple domains: two dsRNA-binding domains (dsRBDs), a La-motif-containing domain, a protein peptidyl isomerase-like domain, and the C-terminal Rossmann-fold methyltransferase (MTase) domain. The MTase domain can bind S-adenosylhomocysteine, the product of the methyl donor, and exhibits MTase activity toward miRNA/miRNA* duplexes in a magnesium (Mg2+)-dependent manner. In addition, multiple domains in the N-terminal region contribute to the recognition and binding of the target dsRNA (18).
In contrast to plants, Hen1 homologues in animals catalyze the methylation of piRNAs and single-stranded siRNAs but not small RNA duplexes (10, 11, 19–22). Mouse Hen1 (mHen1) is specifically expressed in testis and methylates the 3′-ends of piRNAs (9, 11, 21). Drosophila melanogaster Hen1 (DmHen1) uses Ago2-bound single-stranded siRNAs and PIWI-bound piRNAs as substrates. Loss of function of DmHen1 results in shortened length, decreased abundance, and perturbed functions of piRNAs (10). Similarly, the Hen1 homologue in zebrafish methylates piRNAs in germline cells and is required for oocyte development (22). The substrates of mammalian Hen1 are therefore quite different from those of plant Hen1. The substrate of mammalian Hen1 is a single-stranded RNA; thus, mammalian Hen1 lacks the dsRBDs that exist in plant Hen1 to facilitate binding of the RNA duplex. Animal Hen1 is relatively small compared with plant Hen1, containing only an N-terminal MTase domain and a C-terminal domain (CTD). The CTD of Hen1 in zebrafish is responsible for the localization of Hen1 to nuage, a perinuclear area for piRNA biogenesis (22). In addition to animals and plants, Hen1 homologues have also been studied in bacteria, in which Hen1 is able to methylate single-stranded small regulatory RNAs and is involved in the bacterial RNA repair system (23, 24). Thus, the substrates and functions of Hen1 vary among species, and its domain architecture may vary accordingly.
However, for mammalian Hen1 many questions remain unanswered, such as how many types of small noncoding RNAs (sncRNAs) bear 3′-end methylation, whether piRNAs are the sole substrate for mammalian Hen1, and what the structure and the molecular basis of the catalytic mechanism of mammalian Hen1 are. Here, we investigated the methylation pattern of sncRNAs in mouse spermatogonial stem cells (SSCs) by sodium periodate (NaIO4)-assisted deep sequencing. Knockout of Hen1 in mouse SSCs resulted in loss of 3′-end methylation and shortening of the piRNA 3′-end by ~1 nucleotide. We identified a group of noncanonical piRNAs derived from ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), and transfer RNAs (tRNAs), most of which underwent mHen1-dependent 3′-methylation. We also found a class of tRNA-derived sncRNAs that do not belong to the piRNAs, are ~31-33 nt in length, and are likewise methylated by mHen1. Furthermore, we determined the crystal structure of the putative MTase domain of HsHen1 in complex with the cofactor AdoMet at a resolution of 2.0 Å. The structure revealed that this domain shares an allosteric arrangement similar to that of the MTase domain of AtHen1.
Residues composing the active site as well as the FXPP motif at the N terminus of the putative MTase domain were required to methylate target RNAs. In addition, HsHen1 showed a preference for Mn2+ over Mg2+, similar to bacterial Hen1. Our structural and functional data led to the identification of a series of sites critical for the MTase activity of HsHen1 and also provide insight into the molecular basis for the role of Hen1 in methylation of piRNAs and other sncRNAs in mammals.

Hen1-dependent methylation of piRNAs in SSCs
Spermatogonial stem cells have the capacity for self-renewal to maintain the stem-cell pool and differentiate into spermatocytes, spermatids, and finally spermatozoa. Isolated SSCs are able to grow in vitro and restore fertility after being transplanted into the seminiferous tubules of infertile recipient mice. Moreover, SSCs express high levels of piRNAs and mHen1, which makes them an ideal cell model to study the Hen1 substrates. Previous studies in mouse testis showed that the 3′-end of piRNAs was methylated by mHen1 (21, 25). However, SSCs are rare in the adult testis (less than 0.1%), and the changes of sncRNAs in the SSCs upon mHen1 knockout were completely masked by the signal from other abundant cells in the testis. To explore the mHen1 substrates in mouse SSCs, we carried out sodium periodate (NaIO4)-assisted deep sequencing, a method previously used to investigate piRNA methylation (26, 27). We generated mHen1 knockout SSC lines by the CRISPR/Cas9 method as well as Mili (a piRNA-binding protein indispensable for piRNA biogenesis) knockout SSC lines (Fig. 1A). Total RNA from WT and mHen1−/− SSCs was treated with NaIO4 prior to constructing a cDNA library of the small RNAs. In sncRNAs lacking a modification at the 2′-OH, the hydroxyl groups on the 3′-terminal nucleotide were oxidized into formyl groups; these RNAs could not be ligated to the 3′ adaptor and were eliminated during reverse transcription with a 3′ adaptor-specific DNA primer during sncRNA cDNA library construction, whereas RNA molecules with a 2′-O-methyl modification at their 3′-end were unaffected (Fig. 1A and Fig. S1A). Equal amounts of synthetic single-stranded RNA (ssRNA) spike-ins with 2′-O-methylation at the 3′-end ribose, which was verified by a β-elimination reaction, were added to each sample before NaIO4 treatment for sncRNA quantification (Fig. S1A).
The sequencing results revealed that miRNAs and piRNAs were the two major populations of sncRNAs in WT mouse SSCs (Fig. 1B and Table S1). The piRNAs in Mili−/− SSCs were almost completely lost, consistent with previous studies showing that Mili is indispensable for pre-pachytene piRNA production in mouse testis, whereas the miRNAs remained unchanged (Fig. 1B and Fig. S1B) (28, 29). In mHen1−/− SSCs, the abundance of total miRNA was slightly reduced, whereas the relative expression level of individual miRNAs after normalization to total miRNAs was not changed, indicating that mHen1 knockout may have some secondary effect on miRNA processing or stability. The miRNAs completely disappeared after treatment with NaIO4 in WT SSCs, indicating that the elimination of sncRNAs without 3′-end modification by NaIO4 treatment was complete. By contrast, approximately half of the piRNAs in WT SSCs remained

Structural and functional study of Hen1
after NaIO4 treatment, indicating that a significant proportion of piRNAs was methylated. These methylated piRNAs disappeared in mHen1−/− SSCs upon NaIO4 treatment, supporting the conclusion that piRNA 3′-methylation is Hen1-dependent in mammals (22, 25). In contrast to previous studies showing that Hen1 knockout reduced piRNA levels in the fly, zebrafish ovary, and mouse testis (10, 22, 30, 31), as detected by 5′-end labeling of sncRNAs with γ-32P followed by PAGE or by Northern blotting, the relative abundance of piRNAs increased more than 2-fold in mHen1−/− SSCs as detected by deep sequencing with methylated ssRNA oligos as spike-ins (Fig. 1B). To investigate the cause of this discrepancy, we compared the ligation efficiency of 3′-end 2′-O-methylated and non-2′-O-methylated synthetic ssRNA with the 3′ adaptor by T4 RNA ligase II. The result showed that 2′-O-methylation at the 3′-end reduced the ligation efficiency of ssRNA more than 4-fold (Fig. S1C), indicating that the apparent increase in piRNA abundance upon mHen1 knockout was likely due to the higher adaptor ligation efficiency of piRNAs without 3′-methylation during cDNA library construction. Nevertheless, the relative methylation level of different sncRNAs can still be accurately calculated by comparing the sncRNA read counts with and without NaIO4 treatment.

Figure 1. A, strategies to profile the methylation pattern of sncRNAs in vivo and to determine whether methylation is Hen1-dependent in mouse SSCs. B, composition of miRNAs and piRNAs according to their length distribution in WT SSCs with or without NaIO4 treatment, mHen1−/− SSCs with or without NaIO4 treatment, Mili−/− SSCs, and Mili-IP in WT SSCs. The abundance (count) of miRNAs and piRNAs was corrected by exogenous spike-in normalization, except for Mili-IP, where the count represents raw sequencing reads. Mili-IP, sncRNAs immunoprecipitated with a Mili-specific antibody. C, trimming and tailing matrix in WT and mHen1−/− SSCs; the x axis represents the length of the 5′ genome-matched component (head) of all piRNA-related sequences; the y axis represents the length of the tail added to the head. A piRNA sequence that could be perfectly matched to the genome is considered to have no tail and is therefore positioned at y = 0. The area within the circle at each position indicates the relative abundance of tailed and nontailed piRNAs matching that position. D, percentage of piRNAs and miRNAs with tailing in WT and mHen1−/− SSCs. E, percentage of different nucleotides tailed at the piRNA 3′-end. F, methylation level of tailed and nontailed piRNAs in WT and mHen1−/− SSCs.
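The comparison of read counts with and without NaIO4 treatment can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function names, the common spike-in scale, and the example read counts are all hypothetical, and the NaIO4-resistant fraction is taken as a proxy for the relative methylation level.

```python
def spike_in_normalize(read_count, spike_in_reads, spike_in_reference=1000.0):
    """Scale a raw read count so that every library has the same spike-in signal.

    spike_in_reference is an arbitrary common scale (hypothetical value).
    """
    if spike_in_reads <= 0:
        raise ValueError("spike-in reads must be positive")
    return read_count * spike_in_reference / spike_in_reads


def methylation_level(untreated, treated, spike_untreated, spike_treated):
    """Relative fraction of an sncRNA species that survives NaIO4 treatment."""
    before = spike_in_normalize(untreated, spike_untreated)
    after = spike_in_normalize(treated, spike_treated)
    if before == 0:
        return float("nan")
    # Cap at 1.0: sampling noise can push the ratio slightly above one.
    return min(after / before, 1.0)


# Hypothetical counts for one piRNA species (not data from this study).
example_level = methylation_level(
    untreated=20000, treated=5000, spike_untreated=1000, spike_treated=500
)
print(f"relative methylation level: {example_level:.2f}")
```

Scaling both libraries to the same spike-in signal is what makes the treated and untreated counts comparable in this sketch.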
Notably, in contrast to a previous study in which piRNA length was almost unchanged in the testis of mHen1 knockout mice (25), upon mHen1 knockout in the SSCs the piRNAs became ~1 nucleotide shorter than those in WT SSCs (Fig. 1, B and C). This discrepancy indicates that mHen1-dependent 3′-methylation may have different effects on the pre-pachytene piRNAs predominantly expressed in the SSCs and on the pachytene piRNAs predominantly expressed in the adult mouse testis. Our result in the mouse SSCs is consistent with the previous report in zebrafish upon Hen1 knockout (22) and supports the hypothesis that 2′-O-methylation at the 3′-end of the piRNAs by Hen1 may contribute to the resistance of piRNAs to 3′-exonuclease trimming in germ cells. Moreover, 3′-end trimming of piRNAs appears to occur prior to 3′ tailing (Fig. 1C), consistent with observations of the hen1 mutant in plants (17), indicating that this phenomenon is conserved in plants and mammals. The nontemplated tailing ratio of piRNAs was about half that of the miRNAs in SSCs, and most of the tails were longer than 1 nucleotide (Fig. 1, D and E). The tailing ratio of piRNAs increased more than 2-fold in mHen1−/− SSCs, to ~15% of total piRNAs. Interestingly, this increase was mostly contributed by single-nucleotide uridylation and adenylation, which accounted for about 50% of total tailing in mHen1−/− SSCs (Fig. 1E), in agreement with the observation that RNAs without 2′-O-methylation at the 3′-end are favorable substrates for uridylation and adenylation (32). In contrast to the increased tailing of piRNAs upon loss of mHen1, the tailing of miRNAs was not affected, consistent with the lack of 2′-O-methylation at the 3′-end of miRNAs in mammals (Fig. 1E). Interestingly, the tailed piRNAs could still be methylated by mHen1, even though their methylation level was lower than that of the nontailed piRNAs (Fig. 1F). This observation indicates that the 3′-methylation of piRNAs may normally take place in a processing step that occurs earlier than tailing in the cells.
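The head/tail decomposition behind a trimming and tailing matrix can be illustrated with a short sketch. It is a simplification and not the analysis code of this study: instead of aligning reads to the genome, it compares each read with a single reference locus, and the locus sequence, the reads, and the function names are hypothetical.

```python
from collections import Counter


def split_head_tail(read, locus):
    """Split a read into its template-matched 5' head and nontemplated 3' tail.

    `locus` is the reference sequence starting at the read's 5' end.
    A tail nucleotide that happens to match the template is counted as head,
    so tail lengths are lower bounds.
    """
    head_len = 0
    for read_nt, template_nt in zip(read, locus):
        if read_nt != template_nt:
            break
        head_len += 1
    return read[:head_len], read[head_len:]


def trimming_tailing_matrix(reads, locus):
    """Count reads per (head length, tail length) cell."""
    matrix = Counter()
    for read in reads:
        head, tail = split_head_tail(read, locus)
        matrix[(len(head), len(tail))] += 1
    return matrix


# Hypothetical 26-nt locus and reads (not sequences from this study).
locus = "UGACCGAUUGCAGGCUAGCUAACGGA"
reads = [
    locus,               # full length, no tail
    locus[:25],          # trimmed by 1 nt, no tail
    locus[:25] + "U",    # trimmed by 1 nt, then uridylated
    locus[:24] + "UU",   # trimmed by 2 nt, 2-nt tail
]
matrix = trimming_tailing_matrix(reads, locus)
tailed_fraction = sum(n for (_, tail), n in matrix.items() if tail > 0) / len(reads)
```

Each key of `matrix` corresponds to one circle position (x = head length, y = tail length), and `tailed_fraction` corresponds to a tailing ratio for this toy read set.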

Novel sncRNA substrates of mHen1 in mouse SSCs
Besides the canonical piRNAs generated from piRNA clusters, we found that a portion of piRNAs in SSCs was derived from rRNAs, snRNAs, and tRNAs; these were significantly enriched by anti-Mili immunoprecipitation (Mili-IP) and diminished upon Mili knockout (Fig. 2, A-C; Table S1). Notably, ~15% of rRNA-derived small RNAs in SSCs were piRNAs with a peak length of 25-32 nt (Fig. 2, A and D), and more than half were from 18S rRNA (Fig. 2E). Compared with other rRNA-derived sncRNAs, these rRNA-derived piRNAs had an obvious U bias at the 5′-end (Fig. 2F). The highly expressed rRNA-derived piRNAs were located in a few hotspots and had a defined 5′-end and a heterogeneous 3′-end, similar to the canonical piRNAs (Fig. 2G). Approximately one-quarter of snRNA-derived sncRNAs were piRNAs (Fig. 2, B, H, and I), with an even higher U bias at the 5′-end than the rRNA-derived piRNAs (Fig. 2J).
By contrast, we found that only 7% of tRNA-derived sncRNAs were piRNAs (Figs. 2C and 3, A and B). These tRNA-derived piRNAs had both G and U biases at their 5′-end (Fig. 3C), which differed from the 5′ U bias of the Miwi homologue MARWI-bound tRNA-derived piRNAs in marmoset germ cells (33) but was similar to the Miwi2 homologue Hiwi2-bound piRNAs in human somatic cancer cells (34). As expected, the 3′-methylation of the majority of rRNA-, snRNA-, and tRNA-derived piRNAs was mHen1-dependent (Fig. 3, D-F). Interestingly, we found that some tRNA-derived sncRNAs also had mHen1-dependent methylation (Fig. 3, G and H), but their expression levels were not affected in Mili−/− SSCs (Fig. 3I) or Miwi2−/− SSCs (Fig. 3J), and they were not enriched by Mili-IP (Fig. 3I), suggesting that in addition to piRNAs, other sncRNAs can act as mHen1 substrates in cells. We refer to this new type of small RNA as mHen1-methylated tRNA-derived small RNAs (hmtsRNAs). These RNAs were ~31-33 nt in length, longer than piRNAs in SSCs, and had a more obvious 5′ G bias than tRNA-derived piRNAs (Fig. 3, A and C). In addition, they were derived from different tRNAs than the tRNA-derived piRNAs. Specifically, >95% of the hmtsRNAs were derived from four specific tRNA isotypes, Gly-, Lys-, Glu-, and Val-tRNA, whereas tRNA-derived piRNAs were mainly processed from Asp-, Gln-, and Val-tRNA (Fig. 3K). The majority of hmtsRNA sequences started from the first 5′ nucleotide and ended near the anti-codon loop of their corresponding tRNA (Fig. 3L). Notably, hmtsRNAs were greatly enriched in mature sperm, accounting for 40% of total small RNAs (Fig. 3M), and may be functional in the later stages of spermatogenesis as well as in early embryonic development, as reported (35, 36).
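A 5′-nucleotide bias of the kind described above can be computed as the fraction of reads that start with each nucleotide within a chosen length window. The sketch below is only illustrative: the function name and the toy reads are hypothetical, and the 31-33 nt window is used here simply because it is the hmtsRNA length range quoted in the text.

```python
from collections import Counter


def five_prime_bias(sequences, min_len=None, max_len=None):
    """Fraction of reads starting with each nucleotide, optionally length-filtered."""
    selected = [
        s for s in sequences
        if s
        and (min_len is None or len(s) >= min_len)
        and (max_len is None or len(s) <= max_len)
    ]
    counts = Counter(s[0] for s in selected)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {nt: counts[nt] / total for nt in "ACGU"}


# Hypothetical reads (not sequences from this study).
toy_reads = [
    "G" + "A" * 30,  # 31 nt
    "G" + "C" * 31,  # 32 nt
    "U" + "A" * 32,  # 33 nt
    "U" + "A" * 25,  # 26 nt, outside the 31-33 nt window
]
bias = five_prime_bias(toy_reads, min_len=31, max_len=33)
```

Running the same function on Mili-bound and non-Mili-bound fragments separately would give the two profiles that are compared in the text.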

Human Hen1 exhibits higher activity in the presence of manganese ion over magnesium ion
In addition to the canonical piRNAs, we showed that a portion of tRNA-, rRNA-, and snRNA-derived small RNAs associated with Mili in the SSCs were also methylated by Hen1, indicating the variety of substrates of mammalian Hen1. Hen1 shares a conserved MTase domain across species, although the domain architecture of Hen1 varies in plants, mammals, and bacteria (Fig. 4A). To investigate the molecular basis of the 3′-end methylation by animal Hen1, we attempted to crystallize human Hen1. Human Hen1 is 393 amino acids in length and consists of an N-terminal MTase domain and a CTD. According to sequence alignment, the MTase domain is conserved across eukaryotic species. The catalytic domain of human Hen1 shares a sequence identity of 35% with that of AtHen1 (Fig. 4B). To understand the mechanism of sncRNA methylation by Hen1, we first assessed its catalytic activity. Although the MTase activity of bacterial Hen1 was well characterized in previous studies, the enzymatic activity of mammalian Hen1 has not yet been studied (37). A 30-nt single-stranded RNA oligo was synthesized for the MTase assays, and 3H-CH3-AdoMet was used as the cofactor to measure the amount of RNA substrate methylated. Previous studies showed that bacterial Hen1 exhibited higher activity in the presence of Mn2+ than Mg2+ (37, 38). We purified the catalytic domain of Hen1 (HsHen1-MR; residues 21-258) (Fig. 5B). In a reaction containing 100 pmol of RNA oligo and 8 μM HsHen1-MR, only 15.6 pmol of RNA was methylated in the presence of 5 mM MgCl2 over 45 min, whereas at the same concentration of MnCl2, ~100% of the RNA (101 pmol as measured) was methylated. The catalytic activity rose sharply between 1 and 1.5 mM MnCl2 and was essentially saturated at 2 mM MnCl2. Also, the overall amount of methyl group transferred at each MnCl2 concentration was much higher than that at the corresponding MgCl2 concentration. Therefore, human Hen1 also showed a preference for Mn2+, similar to bacterial Hen1 (Fig. 5C). We performed the subsequent methyltransferase experiments using 2 mM MnCl2.
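As a worked check of the numbers quoted above (100 pmol of input RNA, 15.6 pmol methylated with MgCl2, and 101 pmol with MnCl2), the fraction of substrate converted and the Mn2+/Mg2+ ratio can be computed directly. The variable and function names are illustrative only.

```python
INPUT_RNA_PMOL = 100.0        # RNA oligo per reaction, from the text
METHYLATED_MG_PMOL = 15.6     # with 5 mM MgCl2, from the text
METHYLATED_MN_PMOL = 101.0    # with 5 mM MnCl2, from the text


def percent_methylated(methylated_pmol, input_pmol):
    """Percentage of the input RNA that received a methyl group."""
    return 100.0 * methylated_pmol / input_pmol


percent_mg = percent_methylated(METHYLATED_MG_PMOL, INPUT_RNA_PMOL)
percent_mn = percent_methylated(METHYLATED_MN_PMOL, INPUT_RNA_PMOL)
fold_mn_over_mg = METHYLATED_MN_PMOL / METHYLATED_MG_PMOL

print(f"Mg: {percent_mg:.1f}%  Mn: {percent_mn:.1f}%  Mn/Mg: {fold_mn_over_mg:.1f}-fold")
```

A measured value slightly above 100% of input, as with MnCl2 here, is within what one would expect from counting error and simply indicates essentially complete conversion.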

FXPP motif is essential for substrate recognition by HsHen1
To compare the catalytic activity of the full-length Hen1 (HsHen1-FL) with that of the catalytic domain, we purified the full-length protein and two truncated versions, HsHen1-ML (residues 31-258), spanning residues of the methyltransferase domain, and HsHen1-MI (residues 25-258) (Fig. 5, A and B).
Both HsHen1-MR and HsHen1-MI showed slightly higher activity than HsHen1-FL in the in vitro methyltransferase assay, which was performed in triplicate. The measured activity for full-length Hen1 was 81%, compared with 100% for HsHen1-MR and 93% for HsHen1-MI. These data suggest that the CTD may have a negative impact on the methyltransferase activity. Surprisingly, HsHen1-ML showed no activity (Fig. 5D). Comparing HsHen1-ML and HsHen1-MI, we found that HsHen1-ML is only 6 amino acids shorter at the N terminus (Fig. 4B). Sequence alignment indicated that a conserved FXPP motif exists in the N terminus of the MTase domain of eukaryotic Hen1 but not in bacterial Hen1. In the crystal structure of AtHen1, the corresponding region (residues 692-697) specifically recognized and anchored the phosphate group connecting the second and the third nucleotides from the 3′-end of the guide RNA via two hydrogen bonds (Fig. S2). Therefore, we proposed that the FXPP motif may play an important role in substrate binding.
Next, we generated three mutants, namely F27A, P29A, and P30A, in the FXPP motif of HsHen1-MR (Fig. 5B). Fluorescence polarization (FP) assays were performed to measure the binding affinity between the RNA substrate and WT or mutant HsHen1-MR. A 26-nt RNA oligo with a fluorescein label at the 5′-end was used. The measured dissociation constant (KD) for HsHen1-MR was 0.12 ± 0.01 μM. Mutation of the FXPP motif dramatically reduced binding (Fig. 5E). P29A abolished binding, and F27A and P30A reduced the binding affinity about 10- and 30-fold, respectively. HsHen1-ML, which lacks the FXPP motif, showed no binding as expected, similar to bacterial Hen1, although their sequences are not conserved (Fig. 4B) (37). In addition, RNA oligos of different lengths were also tested, including 30, 20, 14, and 8 nt. The FP results showed that variation in length had little impact on the binding affinity of HsHen1-MR for RNA oligos (Fig. 5F). Taken together, these results show that the FXPP motif is critical for binding of the RNA substrate and thus indispensable for the enzymatic activity of HsHen1.
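To illustrate how a dissociation constant can be extracted from an FP titration, the sketch below fits a simple one-site binding isotherm. It is an assumption-laden illustration rather than the fitting procedure used in this study (the curves here were fitted with Origin): the one-site model, the grid search, the function names, and the synthetic data are all hypothetical, and the model assumes that the labeled RNA concentration is well below KD so that free protein approximates total protein.

```python
def model_polarization(conc, kd, p_free, p_bound):
    """One-site binding isotherm: polarization as a function of protein concentration."""
    return p_free + (p_bound - p_free) * conc / (kd + conc)


def fit_kd(conc, signal, kd_min=1e-3, kd_max=1e2, steps=2000):
    """Least-squares fit of KD by a log-spaced grid search.

    For each trial KD the baseline and amplitude are solved by linear regression.
    Returns (kd, p_free, p_bound) in the units of `conc` and `signal`.
    """
    best = None
    n = len(conc)
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        x = [c / (kd + c) for c in conc]          # fraction bound at each point
        mean_x = sum(x) / n
        mean_y = sum(signal) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0:
            continue
        slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal)) / sxx
        intercept = mean_y - slope * mean_x
        sse = sum((intercept + slope * xi - yi) ** 2 for xi, yi in zip(x, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, intercept, intercept + slope)
    return best[1], best[2], best[3]


# Synthetic, noise-free titration generated with KD = 0.12 (hypothetical data).
concentrations = [6.25 / 2 ** k for k in range(11)]
polarization = [model_polarization(c, 0.12, 50.0, 200.0) for c in concentrations]
kd_fit, p_free_fit, p_bound_fit = fit_kd(concentrations, polarization)
```

With real data, a dedicated nonlinear least-squares routine and replicate-based error estimates would be preferable to this grid search.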

Overall structure of the putative MTase domain of human Hen1 in complex with AdoMet
Next, we performed crystallization trials to explore the catalytic mechanism of human Hen1. We made many attempts to crystallize full-length Hen1 (HsHen1-FL) as well as three truncated versions, HsHen1-MR, HsHen1-MI, and HsHen1-ML (Fig. 5A). However, only HsHen1-ML with its cofactor AdoMet was successfully crystallized. Because HsHen1-ML has no catalytic activity, we call this domain the putative MTase domain. The structure was determined by molecular replacement using the structure of the AtHen1 MTase domain (PDB code 3HTX) as a search model. There is only one HsHen1-ML molecule in the asymmetric unit. All of the structural statistics were refined within a reasonable range (Table 1). Overall, the HsHen1 putative MTase domain adopts a classical Rossmann fold, consisting of eight central β-strands (β1-β8) surrounded by five α-helices (α1-α4) (Fig. 6A). In detail, β1-β5 and β6 are parallel to each other, whereas β7, which is located between β5 and β6, is antiparallel to the other strands. β8 is followed by β7, connected by a long loop that could not be modeled in this structure because of poor electron density. Helices α1 and α2 are folded on one side of the β-sheet core, and the others are packed on the opposite side. Superimposition of the MTase domain of HsHen1 to AtHen1 (residues 696-933;

Conserved active sites in human Hen1
Previous studies have shown that the catalytic sites in AtHen1 and bacterial Hen1 comprise the AdoMet-binding site, the divalent metal, and the 3′ adenine nucleotide of the RNA methyl acceptor (18, 23). In our crystal structure, the cofactor AdoMet was clearly observed (Fig. 6C). AdoMet was bound in a pocket formed by Tyr-36, Gly-55, Asp-78, Ile-79, Val-115, and Leu-133 (Fig. 6D). The adenine ring of AdoMet was stabilized by the main-chain amides of Gly-55 and Val-115 via two hydrogen bonds. The carboxyl group of Asp-78 forms two hydrogen bonds with the ribose of AdoMet. The carboxyl group of AdoMet was further stabilized by the phenolic hydroxyl group of Tyr-36. Moreover, Ile-79, Val-115, and Leu-133 contact the adenine ring of AdoMet through hydrophobic interactions. In the previously reported AtHen1 structure, two glutamic acid and two histidine residues were involved in the chelation of the divalent metal ion (18). However, the divalent metal and the RNA substrate were not observed in our structure. Superimposition of the MTase domains of HsHen1 and AtHen1 showed that in the absence of the divalent ion, the side chains of Glu-132, Glu-135, His-136, and His-181 of HsHen1 flipped away to some degree, especially Glu-132 and His-181 (Fig. 6E). We also noticed that the methyl group of AdoMet in the HsHen1 structure was oriented toward the corresponding position of the 2′-OH of the 3′-terminal nucleotide in the AtHen1 structure at a distance of 1.8 Å, making the transfer of the methyl group from AdoMet to RNA spatially possible (Fig. 6E).
To validate whether these residues are critical for HsHen1 MTase activity, site-specific mutagenesis was performed. Glu-132, Glu-135, His-136, and His-181 are the four residues expected to chelate the metal ion (Fig. 6E). Mutations of these residues decreased the methyl transfer activity considerably (Figs. 5B and 6F). H136A and H181A showed a significant loss of MTase activity: the amounts of methylated RNA oligo measured for H136A and H181A were 28 and 24 pmol, respectively, which are 3.5- and 4.1-fold lower than that of WT HsHen1-MR. No activity was observed for the E132A mutant or the double mutant E135A/H136A. Therefore, the residues in the active site of human Hen1 are important for activity, just as in AtHen1 and bacterial Hen1.

Discussion
In this study, we systematically investigated the Hen1-dependent 2′-O-methylation of diverse types of sncRNAs in SSCs. In the absence of mHen1, piRNAs were trimmed from their 3′-ends by one nucleotide and had an increased tailing ratio, with one or more additional U and A residues at their 3′-end. We also identified a novel class of small RNA substrates of mHen1 in SSCs, which are 31-33 nt long and derived from a few types of tRNAs with a significant 5′ G bias (named hmtsRNAs). We further determined the crystal structure of the putative catalytic domain of human Hen1 with its cofactor AdoMet. We found that, similar to bacterial Hen1, HsHen1 prefers Mn2+ over Mg2+. The methyltransferase assays indicated that adding six N-terminal amino acids containing the FXPP motif, which is conserved in eukaryotic Hen1, restored the activity of the putative MTase domain to a level comparable with that of the full-length protein. Site-specific mutagenesis and the binding assays showed that the FXPP motif is critical for substrate RNA binding. Together, our results provide structural and biochemical insights into the catalytic domain of mammalian Hen1 and shed light on the Hen1 substrates in mouse SSCs.
The hmtsRNAs identified in this study accounted for 36% of the tRNA-derived sequences in the mouse SSCs. Compared with the canonical piRNAs, these hmtsRNAs had lower methylation levels, which may be due to their unfavorable interaction with Hen1 without the assistance of PIWI or its associated proteins. The 3′-ends of the majority of hmtsRNA sequences lay near the anti-codon loop. Because Hen1 protein can only catalyze methylation at the 3′-end of RNAs but not internally, the hmtsRNAs are likely to be initially cleaved from partially processed tRNA precursors in which nucleotides have not yet been modified and then methylated at the 3′-end by Hen1. Previous studies have shown that tRNA-derived small RNAs in mature sperm have important functions in epigenetic transgenerational regulation of gene expression. The 3′-methylation by Hen1 may selectively stabilize certain types of tRNA fragments or facilitate their recognition by as yet unknown proteins that function in spermatogenesis and embryonic development, which warrants future investigation. We also found that in addition to the previously reported tRNA-derived piRNAs (33, 34), a portion of rRNA- and snRNA-derived small RNAs was also associated with Mili in the SSCs and possibly functions as piRNAs. Moreover, we observed that most of the rRNA-, snRNA-, and tRNA-derived piRNAs were 3′-methylated by Hen1, although their methylation levels were slightly lower than those of canonical piRNAs. A portion of the 3′-modification of the rRNA-, snRNA-, and tRNA-derived piRNAs was not dependent on mHen1, suggesting that these piRNAs may be processed from precursors already bearing modifications or methylated by an as yet unknown MTase. We also showed that the N-terminal FXPP motif of the MTase domain is important for the binding of substrate RNAs. Sequence alignment indicated that this motif is conserved in eukaryotic Hen1 proteins, including plant Hen1 and mammalian Hen1. However, the substrates of plant Hen1 and mammalian Hen1 are not the same. The substrate for plant Hen1 is duplex RNA, whereas the substrate for human Hen1 is single-stranded RNA. Our FP results show that the FXPP motif is important for single-stranded RNA binding. By analyzing the structure of AtHen1, we found that in addition to the MTase domain, AtHen1 contains two dsRBDs, which bind and stabilize the miRNA/miRNA* duplex. However, only the 3′-end of the target strand is inserted into the active site. The FXPP motif in AtHen1 interacts with the backbone of the 3′-terminal nucleotides of the target strand. Therefore, for AtHen1, the recognition of the substrate RNA can be divided into two parts: the two dsRBDs bind to the major grooves of the duplex RNA, whereas the FXPP motif binds to the 3′-end of the target strand, suggesting that the FXPP motif plays the same role in AtHen1 and human Hen1.

Our results showed a similar binding affinity for single-stranded RNA substrates from 30 to 8 nt, indicating that the binding affinity between Hen1 and its RNA substrate is not affected by the length of the RNA oligo. Recent studies of piRNA biogenesis revealed a trimming mechanism of piRNA intermediates (40–43). In silkworms and mice, the production of mature piRNAs requires PNLDC1, a poly(A)-specific RNase family deadenylase, to trim the 3′-end of piRNA intermediates to an optimal length. Hen1 may be coupled with PNLDC1 to add the 2′-O-methyl group to the trimmed piRNAs. Therefore, we propose that Hen1 itself cannot distinguish single-stranded RNAs of different sizes and that mammalian Hen1 is likely to distinguish its substrates with the assistance of other factors in the piRNA pathway.

Cloning, expression, and purification of human Hen1
The MTase domain (residues 21-258, 25-258, and 31-258) and full-length protein (residues 1-393) of HsHen1 were amplified by PCR and cloned into the pET-SMT3 vector, which contains an N-terminal Ulp1-cleavable 6×His-Sumo tag. Point mutations were introduced using a site-directed mutagenesis kit (New England Biolabs). All of the constructs were verified by sequencing and were expressed in Escherichia coli BL21 (DE3). After induction with 0.2 mM isopropyl β-D-thiogalactopyranoside, the cells were incubated overnight at 18°C. The cells were collected by centrifugation at 4000 rpm for 15 min and were lysed with a cell disruptor (JNBIO) at 4°C. Proteins were purified by affinity chromatography using a His-Trap column (GE Healthcare), after which the 6×His-Sumo tag was removed. The proteins were further purified by gel filtration with a Superdex G75 Hiload 16/60 column (GE Healthcare) in 10 mM Tris buffer, pH 8.0, 100 mM NaCl, 1 mM DTT.

Crystallization, data collection, and structural determination
Purified HsHen1-ML was concentrated to 15 mg/ml and crystallized by vapor diffusion with a reservoir solution of 100 mM Tris-HCl, 200 mM MgCl2, 10% (w/v) PEG 3350, pH 7.5, at 16°C. Diffraction data were collected at BL17U of the Shanghai Synchrotron Radiation Facility (SSRF) and processed with HKL 2000 (44). The structure was solved by the molecular replacement (MR) method with the structure of the AtHen1 MTase domain (residues 697-933, PDB code 3HTX) as the search model. Model building and structural refinement were carried out using COOT (45) and PHENIX (46). All the structure figures were generated with PyMOL (The PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC). The statistics of the diffraction data and the refinement are summarized in Table 1. The coordinates have been deposited under PDB accession code 5WY0.

In vitro methyltransferase assays
The methyltransferase assays were carried out in a 10-μl reaction mixture containing 25 mM Tris-HCl, pH 8.5, 0.1 mM EDTA, 20 μM 3H-CH3-AdoMet, and 2 mM MnCl2 with 10 μM 30-nt RNA (5′-UGUCUGACUGAAGGACCAGGUGCUGUCUGA-3′) and 8 μM purified protein. WT or mutant HsHen1 proteins were pretreated with 1 mM EDTA on ice for 30 min to remove any endogenous divalent metal ions and then dialyzed against reaction buffer for further methyltransferase assays. The reaction mixtures were incubated at 37°C for 45 min. The reaction was stopped by adding 1 μl of 1 mM EDTA. Then the samples were spotted onto a Whatman 3 MM filter disk. The disk was washed twice with 5% TCA solution and three times with ethanol for 5 min per wash. After washing, the disk was completely dried. The radioactivity was measured by liquid scintillation counting (Beckman). For the methyltransferase assays showing that HsHen1-MR prefers manganese over magnesium, 0, 0.5, 1, 1.5, 2, or 5 mM of either MnCl2 or MgCl2 was added as specified.
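The absolute amounts in one reaction follow directly from concentration times volume. The short calculation below reads the reaction volume as microliters and the RNA, AdoMet, and protein concentrations as micromolar, which is consistent with the 100 pmol of RNA oligo per reaction quoted in the Results; the helper name is illustrative.

```python
def pmol_in_reaction(concentration_um, volume_ul):
    """Amount in pmol, because 1 uM x 1 ul = 1 pmol."""
    return concentration_um * volume_ul


REACTION_VOLUME_UL = 10.0

rna_pmol = pmol_in_reaction(10.0, REACTION_VOLUME_UL)      # 30-nt RNA substrate
adomet_pmol = pmol_in_reaction(20.0, REACTION_VOLUME_UL)   # 3H-CH3-AdoMet
enzyme_pmol = pmol_in_reaction(8.0, REACTION_VOLUME_UL)    # purified protein

print(f"RNA {rna_pmol:.0f} pmol, AdoMet {adomet_pmol:.0f} pmol, enzyme {enzyme_pmol:.0f} pmol")
```

Under this reading the methyl donor is in 2-fold molar excess over the RNA, so complete methylation of the substrate is not limited by AdoMet.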

Fluorescence polarization assays
The FP experiments were performed in 96-well plates (Corning) with the fluorescence reader Synergy NEO (BioTek). Serial dilutions of purified HsHen1 proteins were prepared in FP assay buffer (100 mM NaCl, 10 mM Tris-HCl, pH 8.0, 1 mM MnCl2) with final concentrations ranging from 0.006 to 6.25 μM. FAM-labeled RNA oligos were then added to a final concentration of 4 nM in a final assay volume of 100 μl. The reaction mixture was incubated for 30 min at room temperature. Polarization was measured at an excitation wavelength of 485 nm and an emission wavelength of 528 nm. Each plate was read three times, and the values were averaged prior to analysis. All dissociation constant (KD) values were determined by fitting the titration curve with Origin 8.0 software.
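The text does not state which binding model was fitted in Origin. As a minimal sketch only, the Python snippet below assumes a simple one-site model with protein in large excess over the 4 nM labeled RNA, FP = FP_min + (FP_max − FP_min)·[P]/(KD + [P]), and estimates KD by a log-scale grid search. The function names, the search range, and the grid-search strategy are illustrative assumptions, not the authors' procedure.

```python
def one_site(p, kd, fp_min, fp_max):
    """Polarization predicted by a one-site model (assumed, protein in excess)."""
    return fp_min + (fp_max - fp_min) * p / (kd + p)


def fit_kd(conc, fp, kd_lo=1e-3, kd_hi=1e2, steps=2000):
    """Estimate KD (same units as conc) from a titration curve.

    For each trial KD on a logarithmic grid, the baseline and amplitude are
    obtained by ordinary linear least squares against the fraction bound;
    the trial KD with the smallest sum of squared errors is returned.
    """
    n = len(conc)
    mean_y = sum(fp) / n
    best = None
    for i in range(steps + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / steps)
        x = [p / (kd + p) for p in conc]  # fraction of RNA bound
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, fp))
        amp = sxy / sxx                   # FP_max - FP_min
        base = mean_y - amp * mean_x      # FP_min
        sse = sum((base + amp * xi - yi) ** 2 for xi, yi in zip(x, fp))
        if best is None or sse < best[0]:
            best = (sse, kd, base, base + amp)
    _, kd, fp_min, fp_max = best
    return kd, fp_min, fp_max


# Hypothetical twofold dilution series spanning roughly 0.006 to 6.25 (in μM)
conc = [6.25 / 2 ** k for k in range(11)]
# Synthetic, noise-free polarization values generated with KD = 0.5 μM
fp = [one_site(p, 0.5, 50.0, 250.0) for p in conc]
kd, fp_min, fp_max = fit_kd(conc, fp)
```

With real data, a nonlinear least-squares routine would normally replace the grid search, and a quadratic (ligand-depletion) model would be needed if KD were comparable to the RNA concentration.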

Isolation and manipulation of mouse SSCs
SSCs were isolated and cultured according to a previous protocol (47). In brief, testes were dissociated from a 15-day-old C57BL/6×DBA2 F1 male mouse carrying transgenic actin-enhanced GFP genes, and single-cell suspensions were obtained. Then ~2.0 × 10^5 cells were suspended in 1 ml of StemPro34 SFM medium containing 5 mg/ml BSA, 6 mg/ml glucose, 2 mM glutamine, 1× antibiotic/antimycotic, 1× minimal essential medium vitamins, 1× nonessential amino acids, 10 μg/ml biotin, 25 μg/ml insulin, 30 μg/ml sodium pyruvate, 0.06% lactic acid, 100 μM ascorbic acid, 30 nM sodium selenite, 60 μM putrescine, 100 μg/ml bovine apo-transferrin, 10 μM 2-mercaptoethanol, 1% fetal bovine serum, 20 ng/ml recombinant mouse EGF, 10 ng/ml recombinant human basic fibroblast growth factor, and 10 ng/ml recombinant rat glial cell line-derived neurotrophic factor. Cells were cultured in a 12-well plate in a humidified incubator containing 5% CO2 at 37°C for 24 h. The cells were pipetted 10 times, and the floating cells were collected by centrifugation at 270 × g for 5 min. The pellets were suspended in SF medium and incubated in a humidified incubator for 10 days, during which time the medium was refreshed every 3 days. The SSC colonies were trypsin-digested and plated on mouse embryo fibroblast feeder cells in a 24-well plate with SF medium. Plasmids encoding spCas9 and sgRNAs (20 μg) targeting mHen1, Mili, and Miwi2 were tran-

NaIO4 treatment and small RNA library preparation
Total cellular RNA was extracted using TRIzol reagent (Takara). A total of 4 μg of total RNA with a 2 × 10^-6 pmol spike-in mixture and 2 μl of freshly dissolved NaIO4 (200 mM) were kept on ice for 1 h in the dark. The RNAs were then precipitated and dissolved in DEPC-treated water for small RNA library construction according to the Illumina protocol. High-throughput RNA sequencing was performed using a HiSeq 2500 with 50 running cycles (GENEWIZ). The four spike-in RNA oligonucleotides containing 5′-phosphorylation with or without 3′-end 2′-O-methylation were chemically synthesized (Integrated DNA Technologies, IDT). The sequences are as follows: spike-in #1, 5Phos/rCrCrUrGrGrArCrUrArGrUrCrGrUrCrArGrCrArUrU, and 3′-methylated spike-in #1m, 5Phos/rCrCrUrGrGrArCrUrArGrUrCrGrUrCrArGrUrGrUmU; spike-in #2, 5Phos/rArArCrUrUrCrArGrGrGrUrCrArGrCrUrUrGrCrCrG, and 3′-methylated spike-in #2m, 5Phos/rUrUrGrUrArArGrArCrArUrGrArArArGrUrArCrUmG.

Immunoprecipitation
In vitro-cultured SSCs were digested and washed with PBS and harvested in lysis buffer containing 50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM EDTA, 0.5 mM DTT, 0.5% Nonidet P-40, 0.1 unit/μl RNase inhibitor (Fermentas), and 1:100 protease inhibitor mixture (Sigma). After rotation at 4°C for 20 min, the lysate was clarified by centrifugation at 14,000 × g at 4°C for 15 min. A total of 400 μl of supernatant was mixed with 5 μg of Mili antibody (MABE363, Millipore), coupled with protein G beads (Invitrogen), and incubated overnight with gentle rotation at 4°C. The beads were washed four times with TBS before TRIzol reagent (Invitrogen) was added for RNA extraction.

β-Elimination treatment and Northern blotting
Synthetic RNAs were treated with 1× borate buffer (30 mM borax and 30 mM boric acid, pH 8.6) and 25 mM freshly dissolved NaIO4 on ice for 30 min in the dark. Then 2 μl of glycerol was added to the reaction to quench NaIO4 and incubated for 10 min. RNAs were precipitated and dissolved in 50 μl of 1× borax buffer (30 mM borax, 30 mM boric acid, and 50 mM NaOH, pH 9.5) and incubated at 45°C for 90 min. Finally, the RNAs were precipitated and dissolved in DEPC-treated water. The RNA samples were denatured, fractionated by electrophoresis on a 20% polyacrylamide, 8 M urea gel at 500 V until the bromphenol blue reached the bottom of the gel, and then electroblotted and cross-linked to a nylon membrane (Roche Applied Science). The membranes were probed at 50°C in DIG Easy Hyb buffer (Roche Applied Science) with terminally digoxigenin-labeled DNA oligonucleotides overnight, and then washed with 2× SSC and 0.1× SSC, 0.1% SDS buffer at 37°C. The membranes were incubated with an anti-digoxigenin antibody (Roche Applied Science) (1:20,000 diluted) and then CDP-Star (ABI), after which the signal was detected on X-ray film.

Computational categorization of small RNAs
The raw fastq data were pre-processed using a common procedure. After quality filtering, the 3′ adaptor was clipped from the sequencing reads, requiring a minimum match of 10 nt from the 5′-end. Reads that did not match the adaptor sequence or were shorter than 17 nt after adaptor clipping were discarded. The retained reads were mapped to the mouse genome with bowtie (48). The genome-mapped sequences were further aligned to known miRNAs, tRNAs, rRNAs, snoRNAs, and snRNAs with bowtie. The remaining 25-32-nt sequences were used to identify piRNAs following the method described previously (49) with slight modifications. The clustering parameters were set to MinReads = 4 and Eps = 2500 bp after running a series of k-dist analyses with different Eps and MinReads values on our own data. All candidate clusters that satisfied these parameters were considered piRNA clusters, and the sequences located in these clusters were defined as piRNAs without any further scoring or filtering. Only reads that exactly matched the 5′ start site of annotated miRNAs and whose 3′-ends carried ≤2-nt deletions or additional sequences derived from pri-miRNAs were counted as miRNAs. The counts of miRNA, piRNA, and tRNA-, rRNA-, and snRNA-derived sequences were corrected by the count of methylated spike-ins. The small RNAs were mapped in the following order: miRNA, tRNA, rRNA, snoRNA, snRNA, and piRNA. The rRNA-, tRNA-, and snRNA-derived sncRNAs whose expression levels decreased more than 10-fold upon Mili knockout in SSCs were identified as piRNAs. The identified piRNA clusters are listed in Table S2. The count of individual miRNAs normalized by methylated spike-ins in each sample is listed in Table S3. The information on noncanonical piRNAs derived from rRNAs, tRNAs, and snRNAs is listed in Table S4. The information on hmtsRNAs is listed in Table S5.
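The clipping, ordered assignment, and spike-in correction steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used here: the adaptor is matched exactly through its first 10 nt, "corrected by the count of methylated spike-ins" is interpreted as a simple division, and all function names are hypothetical.

```python
# Assignment priority taken from the text: earlier categories win.
CATEGORY_ORDER = ("miRNA", "tRNA", "rRNA", "snoRNA", "snRNA", "piRNA")


def clip_3p_adaptor(read, adaptor, min_match=10, min_len=17):
    """Return the insert upstream of the 3' adaptor, or None if discarded.

    Assumption: the first `min_match` nt of the adaptor must occur exactly
    in the read; mismatches are not tolerated in this sketch.
    """
    if len(adaptor) < min_match:
        return None
    pos = read.find(adaptor[:min_match])
    if pos == -1:
        return None            # no adaptor found: read discarded
    insert = read[:pos]
    if len(insert) < min_len:
        return None            # insert too short after clipping
    return insert


def assign_category(hits):
    """Pick one category for a read that aligned to the references in `hits`."""
    for category in CATEGORY_ORDER:
        if category in hits:
            return category
    return None


def normalize_by_spike_in(counts, spike_in_count):
    """Divide raw counts by the methylated spike-in count (assumed scheme)."""
    return {name: n / spike_in_count for name, n in counts.items()}


# Example adaptor string for illustration; the text does not give the sequence.
ADAPTOR = "TGGAATTCTCGGGTGCCAAGG"
insert = clip_3p_adaptor("TGAGGTAGTAGGTTGTATAGTT" + ADAPTOR, ADAPTOR)
```

Because a read is assigned to the first matching category, a sequence that aligns to both an rRNA and a piRNA cluster is counted as rRNA-derived unless a later step, such as the Mili-knockout criterion, reclassifies it.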

Nontemplate tailing
The sequences unable to be mapped to the mouse genome without any mismatch were used for further analyses of nontemplate tailing. Only those with a 5′-end identical to the annotated miRNAs and piRNAs and with consecutive mismatches at the 3′-end were defined as tailed. The ratio of tailing might have been underestimated because the last residue of some small RNAs that aligned to the genome was in fact added through 3′-tailing but was indistinguishable by bioinformatics analysis alone.
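A minimal Python sketch of this tailing criterion is given below. It assumes that each read has already been anchored at the annotated 5′ start site and that the genomic sequence downstream of that site is available as a string; the function names and the longest-matching-prefix rule are illustrative assumptions rather than the exact procedure used.

```python
def split_nontemplated_tail(read, template):
    """Split a read into its genome-matching body and its 3' tail.

    `template` is the genomic sequence starting at the annotated 5' start
    site and extending past the expected 3' end. The body is the longest
    prefix of the read that matches the template exactly; everything after
    it is treated as a nontemplated tail.
    """
    n = 0
    limit = min(len(read), len(template))
    while n < limit and read[n] == template[n]:
        n += 1
    return read[:n], read[n:]


def is_tailed(read, template):
    """A read is called tailed if a nonempty tail follows a nonempty body."""
    body, tail = split_nontemplated_tail(read, template)
    return bool(body) and bool(tail)


# Hypothetical genomic context: a 22-nt small RNA followed by flanking bases.
template = "TGAGGTAGTAGGTTGTATAGTT" + "TTAGG"
body, tail = split_nontemplated_tail("TGAGGTAGTAGGTTGTATAGTT" + "AA", template)
# body is the 22-nt genome-matching sequence and tail is "AA"
```

The sketch also reproduces the underestimation noted above: if the added residues were "TA" rather than "AA", the first tail base would coincide with the flanking genomic "T" and only the final "A" would be scored as nontemplated.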

Data and material availability
All data used to obtain the conclusions in this paper are presented in the paper and the supporting materials. The deep sequencing data have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus under accession number GSE97595. Atomic coordinates and structure factors for the reported crystal structure have been deposited with the Protein Data Bank under accession number 5WY0. Other data may be requested from the authors.