GTSF1 accelerates target RNA cleavage by PIWI-clade Argonaute proteins

Argonaute proteins use nucleic acid guides to find and bind specific DNA or RNA target sequences. Argonaute proteins have diverse biological functions and many retain their ancestral endoribonuclease activity, cleaving the phosphodiester bond between target nucleotides t10 and t11. In animals, the PIWI proteins, a specialized class of Argonaute proteins, use 21–35 nucleotide PIWI-interacting RNAs (piRNAs) to direct transposon silencing, protect the germline genome, and regulate gene expression during gametogenesis 1 . The piRNA pathway is required for fertility in one or both sexes of nearly all animals. Both piRNA production and function require RNA cleavage catalysed by PIWI proteins. Spermatogenesis in mice and other placental mammals requires three distinct, developmentally regulated PIWI proteins: MIWI (PIWIL1), MILI (PIWIL2) and MIWI2 2–4 (PIWIL4). The piRNA-guided endoribonuclease activities of MIWI and MILI are essential for the production of functional sperm 5,6 . piRNA-directed silencing in mice and insects also requires GTSF1, a PIWI-associated protein of unknown function 7–12 . Here we report that GTSF1 potentiates the weak, intrinsic, piRNA-directed RNA cleavage activities of PIWI proteins, transforming them into efficient endoribonucleases. GTSF1 is thus an example of an auxiliary protein that potentiates the catalytic activity of an Argonaute protein.

In animals, PIWI-interacting RNAs (piRNAs) 21 to 35 nucleotides in length direct PIWI proteins to silence transposons and regulate gene expression 1 . Invertebrates produce piRNAs and PIWI proteins in both the soma and the germline, but mammalian piRNAs act only in the germline. Mice that lack any of their three PIWI proteins (MIWI2, MILI and MIWI) or other proteins required for piRNA production are invariably male-sterile [1][2][3][4] . As in other animals, mouse piRNA production requires the catalytically active PIWI proteins MILI and MIWI 5,6 . Transposon silencing is the ancestral function of piRNAs; uniquely, placental mammals also produce pachytene piRNAs, which first appear shortly after the onset of meiosis I [13][14][15][16][17][18] and reach a peak abundance in spermatocytes rivalling that of ribosomes 19 . Pachytene piRNAs tune the abundance of mRNAs required for spermiogenesis, the process by which round spermatids become sperm [20][21][22][23] . Pachytene piRNAs have been proposed to direct MIWI and MILI to cleave specific mRNAs, ensuring appropriate levels of their protein products 20,[23][24][25] . Testing this idea has been thwarted by the absence of a cell-free system in which MIWI or MILI can be loaded with synthetic piRNAs of defined sequence.

Recombinant MIWI loaded with a defined piRNA
We used lentiviral transduction to engineer a stable HEK293T cell line over-expressing epitope-tagged MIWI. Tagged MIWI was captured from cell lysate using anti-Flag antibody coupled to paramagnetic beads, incubated with a synthetic piRNA bearing a monophosphorylated 5′ terminus and 2′-O-methylated 3′ end, and eluted from the magnetic beads using 3×Flag peptide. MIWI loaded with a synthetic piRNA (MIWI piRISC) (Fig. 1a), but not unloaded apo MIWI (Extended Data Fig. 1a), cleaved a 5′ 32
Recombinant MIWI bound stably to an RNA guide bearing a 5′ monophosphate but not to a guide with a 5′ hydroxyl group (Extended Data Fig. 1d). The MID domain of Argonaute proteins contains a 5′ monophosphate-binding pocket that anchors the RNA guide to the protein [26][27][28][29][30] . Mutations predicted to disrupt 5′ monophosphate binding perturb PIWI function 28,[31][32][33][34][35] . We immobilized MIWI on paramagnetic beads via its 3×Flag tag, incubated it with guide RNA, washed the beads, eluted the MIWI piRISC with 3×Flag peptide, and tested its ability to cleave a fully complementary target RNA. Incubation with a 5′ monophosphorylated guide, but not with an otherwise identical 5′ hydroxyl guide, yielded MIWI piRISC that cleaved target RNA (Extended Data Fig. 1d). In vivo, the methyltransferase HENMT1 adds a 2′-O-methyl group to the 3′ ends of piRNAs. Terminal 2′-O-methyl modification likely stabilizes small RNAs against degradation by cellular ribonucleases rather than securing the guide to the protein 1 . Consistent with this function for 2′-O-methylation, piRNAs bearing a 3′ terminal 2′ hydroxyl or 2′-O-methyl were equally functional in directing target cleavage, provided that the piRNA was 5′ monophosphorylated (Extended Data Fig. 1d).

Purified MIWI piRISC is a slow-acting enzyme
Although affinity-purified MIWI loaded with a piRNA bearing 5′ monophosphorylated and 3′, 2′-O-methylated termini specifically cleaved a fully complementary target RNA at the phosphodiester bond that links target nucleotides t10 to t11, the rate of cleavage (0.01 min −1 ) was at least 300 times slower than that catalysed by mouse AGO2 RISC 36,37 (≥3 min −1 ). At physiological temperature (37 °C), mouse AGO2 RISC catalyses multiple rounds of target cleavage 38 . By contrast, 5 nM MIWI piRISC cleaved only around 15% of the target RNA (5 nM) after 1 h (Fig. 1b, Extended Data Fig. 1e and Supplementary Fig. 1). Inefficient target cleavage by MIWI piRISC was not caused by the presence of the amino-terminal 3×Flag-SNAP tandem tag: removing the tandem tag using tobacco etch virus protease generated an untagged protein (Extended Data Fig. 1f and Supplementary Data Fig. 1) whose target-cleavage kinetics were identical to those of piRISC assembled with the tagged MIWI (Extended Data Fig. 1g).
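The rate comparison above can be checked with a few lines of arithmetic. The sketch below assumes simple first-order kinetics purely for illustration and converts the two rate constants quoted in the text into a fold difference and into half-times; the function name is hypothetical and not part of the study.

```python
import math

def half_time(k_per_min: float) -> float:
    """Half-time (min) of a first-order process with rate constant k (min^-1)."""
    return math.log(2) / k_per_min

k_miwi = 0.01  # min^-1, MIWI piRISC alone (value quoted in the text)
k_ago2 = 3.0   # min^-1, lower bound quoted for mouse AGO2 RISC

# Because the AGO2 value is a lower bound, the ratio is a lower bound too.
fold_slower = k_ago2 / k_miwi

print(f"MIWI piRISC is at least {fold_slower:.0f}-fold slower than AGO2 RISC")
print(f"half-time, MIWI piRISC: {half_time(k_miwi):.1f} min")
print(f"half-time, AGO2 RISC: at most {half_time(k_ago2):.2f} min")
```

Under this simplifying assumption, the half-time for MIWI piRISC is on the order of an hour, whereas that for AGO2 RISC is well under a minute.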
The ubiquitously expressed arginine methyltransferase PRMT5 modifies the amino terminus of PIWI proteins, allowing it to bind Tudor domain-containing proteins, many of which are required for piRNA biogenesis, gametogenesis and fertility [39][40][41] . A potential explanation for the sluggish activity of recombinant MIWI piRISC is that it lacks arginine methylation. We used mass spectrometry to map the positions of methyl arginine in our affinity-purified MIWI. All of the arginine residues previously shown to be methylated in endogenous MIWI immunoprecipitated from mouse testis 39 were methylated in recombinant MIWI produced in HEK293T cells (Extended Data Fig. 1h).
Another possible explanation for the inefficiency of target cleavage by MIWI piRISC is that the recombinant protein, although properly arginine methylated, lacks other post-translational modifications. To test this idea, we immobilized apo MIWI, incubated it with wild-type (C57BL/6) mouse testis lysate in the presence of an ATP-regenerating system at 25 °C for 15 min, removed the testis lysate by washing, and then loaded MIWI with a synthetic piRNA and eluted the resulting piRISC. Pre-incubation of MIWI with testis lysate either before or after loading with a piRNA did not increase its target-cleaving activity (Extended Data Fig. 2a,b). We conclude that neither a missing post-translational modification nor a tightly associated protein partner is likely to explain the low activity of MIWI piRISC compared to AGO2 RISC.

MIWI piRISC requires an auxiliary factor
Adding testis lysate increased the rate of target cleavage by affinity-purified MIWI piRISC around 20-fold (Fig. 1b and Extended Data Fig. 2b). This effect cannot be attributed to the lysate contributing additional piRISC, because testis lysate alone did not generate detectable cleavage product (Fig. 1b).
MIWI is first produced as spermatogonia enter meiosis and differentiate into spermatocytes 3,42 . To determine whether the MIWI-potentiating factor is differentially expressed during spermatogenesis, we supplemented the target-cleavage assay with lysate prepared from stage-specific germ cells purified using fluorescence-activated cell sorting (FACS). piRISC-potentiating activity was greatest in lysate from secondary spermatocytes (Fig. 1c), a cell type in which pachytene piRNA-directed target cleavage is readily detected in vivo 20,23 . Moreover, the potentiating activity was testis-specific: lysates from brain, liver or kidney did not enhance piRNA-directed target cleavage by MIWI (Extended Data Fig. 3a). Finally, the potentiating activity was specific for PIWI proteins and had no effect on the rate of target cleavage by mouse AGO2 (Extended Data Fig. 3b).
Three lines of evidence suggest that the MIWI-potentiating activity contains one or more structural Zn 2+ ions. First, pre-treating testis lysate with the sulfhydryl alkylating agent N-ethylmaleimide inactivated the potentiating activity (Extended Data Fig. 3c), indicating that reduced cysteine residues, which often bind divalent metal cations 43 , are essential. Second, the MIWI-potentiating activity was unaltered by EGTA, which chelates Ca 2+ , but was irreversibly inactivated by EDTA, which chelates many divalent metals, and by 1,10-phenanthroline, which specifically chelates Zn 2+ (Extended Data Fig. 3d). Adding additional metal ions did not rescue the activity (Extended Data Fig. 3e-g), suggesting that loss of Zn 2+ irreversibly denatures the MIWI-potentiating factor, a characteristic of zinc-finger proteins 44,45 . Third, the MIWI-potentiating activity bound more tightly to an immobilized metal affinity resin charged with Zn 2+ than to resin charged with Ni 2+ (Extended Data Fig. 3h).
To identify the MIWI-potentiating activity, we developed a chromatographic purification scheme using cation-exchange, hydrophobic-interaction and size-exclusion chromatography (Fig. 2a). Notably, the activity eluted from a Superdex 200 size-exclusion column as a single peak at approximately 17 kDa (Fig. 2b). Together, our data suggest that the MIWI-potentiating activity corresponds to a small, testis-specific, Zn 2+ -binding protein abundantly expressed in meiotic and post-meiotic male germ cells (Fig. 2c).
To test whether the MIWI-potentiating activity in the testis lysate corresponded to GTSF1, we used CRISPR-Cas9 to engineer a mouse strain with a 3×Flag-HA epitope tag inserted into the endogenous Gtsf1 coding sequence (C57BL6/J-Gtsf1 em1(Flag)Pdz , hereafter referred to as Gtsf1 Flag ; Fig. 2d). Because the MIWI-potentiating activity was greatest in secondary spermatocytes, we prepared lysate from FACS-purified secondary spermatocytes from homozygous Gtsf1 Flag/Flag male mice. Incubation with anti-Flag antibody coupled to paramagnetic beads depleted the lysate of both epitope-tagged GTSF1 (Fig. 2e) and the MIWI-potentiating activity (Fig. 2f). The activity was recovered from the beads by elution with 3×Flag peptide (Fig. 2f). By contrast, lysate from C57BL/6 secondary spermatocytes retained the MIWI-potentiating factor after incubation with anti-Flag antibody, and no activity was eluted from the beads after incubation with Flag peptide. Thus, GTSF1 is necessary to potentiate the catalytic activity of MIWI.
GTSF1 is also sufficient to potentiate the catalytic activity of MIWI. Adding purified recombinant GTSF1 (Extended Data Fig. 4b) to MIWI piRISC increased the rate of target cleavage by MIWI 19- to 100-fold for three different piRNA sequences (Fig. 3a-c and Extended Data Fig. 4c). First, the addition of 500 nM GTSF1 to MIWI programmed with an artificial, 30-nt piRNA (5 nM) caused the pre-steady-state rate (k burst ) of target cleavage to increase from 0.010 min −1 to 1.1–1.5 min −1 (Fig. 3), a rate similar to that of AGO2 36,37 (≥3 min −1 ).
Second, programming MIWI with an endogenous mouse piRNA sequence from the 9-qC-10667.1 locus (pi9) that is antisense to the L1MC transposon increased the pre-steady-state rate of cleavage of a fully complementary target RNA ~19-fold: k burst = 0.033 min −1 for piRISC alone and 0.62 min −1 with GTSF1 added (
Our data suggest that GTSF1 does not promote product release or turnover: the steady-state rate (k ss < 0.005 min −1 ) of target cleavage under multiple-turnover conditions (that is, GTSF1 ≫ target RNA ≫ MIWI) was essentially unchanged from the rate in the absence of GTSF1 (Fig. 3b). MIWI might remain bound to the cleaved products, preventing it from catalysing multiple rounds of target cleavage. Supporting this idea, at incubation times greater than 15 min, 3′-to-5′ exonucleases present in testis lysate degrade the uncut, full-length target RNA, but the 5′ cleavage product remains stable, consistent with its being protected by MIWI bound to its 3′ end (Fig. 1b). In vivo, product release has been proposed to be facilitated in insects and mammals by the RNA-stimulated ATPase Vasa 48-50 (also known as DDX4) and the Vasa-like protein DDX43 51 .
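The fold stimulation follows directly from the k burst values quoted above. The sketch below simply takes the ratio of the rate with GTSF1 to the rate without it; the helper name is hypothetical, and the inputs are the rounded values printed in the text, so the ratios are approximate.

```python
def fold_stimulation(k_with_gtsf1: float, k_without_gtsf1: float) -> float:
    """Ratio of pre-steady-state rates (both in min^-1)."""
    return k_with_gtsf1 / k_without_gtsf1

# pi9 piRNA: 0.033 min^-1 alone, 0.62 min^-1 with GTSF1
pi9_fold = fold_stimulation(0.62, 0.033)

# Artificial 30-nt piRNA: 0.010 min^-1 alone, 1.1-1.5 min^-1 with GTSF1
artificial_low = fold_stimulation(1.1, 0.010)
artificial_high = fold_stimulation(1.5, 0.010)

print(f"pi9: about {pi9_fold:.0f}-fold")
print(f"artificial piRNA: about {artificial_low:.0f}- to {artificial_high:.0f}-fold")
```

Because the inputs are rounded, the computed ratios need not match the reported fold changes exactly.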

RNA binding is essential for GTSF1 function
Only three eukaryotic proteins are known to contain CHHC zinc-fingers: the spliceosomal RNA-binding protein U11-48K, the TRM13 tRNA methyltransferase, and GTSF1 and its paralogues 52 . In vitro, the first zinc-finger of GTSF1 binds RNA directly; RNA binding requires four basic surface residues 53 (R26, R29, K36 and K39). Potentiation of target cleavage required RNA binding by GTSF1: purified mutant GTSF1(R26A/ R29A/K36A/K39A) (Extended Data Fig. 4b) did not detectably increase the rate of catalysis by either MIWI or MILI (Fig. 3d-f and Extended Data Fig. 4c), suggesting that GTSF1 must interact with RNA to function.

GTSF1 function is evolutionarily conserved
GTSF1 orthologues are found in many metazoan genomes 52 , suggesting that GTSF1 may potentiate target cleavage by PIWI proteins in many animals. Supporting this idea, purified recombinant Gtsf1 from the arthropod Bombyx mori (Extended Data Fig. 4b) potentiated the catalytic activity of the B. mori PIWI protein, Siwi (Extended Data Fig. 6a).
The structure and kinetics of Piwi from the freshwater sponge Ephydatia fluviatilis (EfPiwi) were recently described 35 . Like MIWI, EfPiwi possesses inherently weak catalytic activity, a feature common to all PIWI proteins examined to date. Although the E. fluviatilis genome is yet to be sequenced, the closely related, fully sequenced genome of Ephydatia muelleri contains a readily identifiable GTSF1 orthologue. Purified recombinant EmGtsf1 (Extended Data Fig. 4c) stimulated the single-turnover catalytic rate of EfPiwi piRISC by around 28-fold (Fig. 3h). Sponges (Porifera) are the sister group to all other animals, the Eumetazoa, having separated around 900 million years ago. Thus, the last common ancestor of all animals probably required GTSF1 to potentiate target cleavage by PIWI proteins.
The GTSF1 tandem zinc-finger domains are conserved across phyla, whereas the GTSF1 central and carboxy-terminal sequence diverges substantially between mammals and arthropods (Extended Data Table 1). For example, the sequences of the mouse and rhesus macaque first and second zinc-fingers are 100% identical, whereas their C-terminal domains share 88.5% identity. The first zinc-finger of mouse GTSF1 is 37.5% and 45.8% identical to its fly and moth orthologues, but the mouse protein shares just 8% and 8.3% identity with the central and C-terminal domains of the fly and moth proteins, respectively. Consistent with the evolutionary divergence of their central and C-terminal domains, testis lysate from rat or rhesus macaque enhanced target cleavage by MIWI piRISC, whereas lysate from Drosophila melanogaster or Trichoplusia ni ovaries, T. ni Hi5 cells or purified recombinant B. mori BmGtsf1 did not (Extended Data Fig. 6b,c and Supplementary Data Fig. 1).
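Percent-identity values such as those above are obtained by counting matching positions in a pairwise alignment. The sketch below shows one common convention: identical residues divided by all aligned columns that are not gaps in both sequences. The convention used for Extended Data Table 1 is not stated in this section, and the two short sequences are invented placeholders, not GTSF1 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

# Hypothetical 10-residue aligned fragments, for illustration only
toy_a = "CPYDKNHIVP"
toy_b = "CPFNANHRVP"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical")
```

Other conventions (for example, dividing by the length of the shorter sequence) give different numbers for gapped alignments, so the denominator should be reported along with the value.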

GTSF1 paralogues can distinguish PIWI paralogues
Many animal genomes encode more than one GTSF protein 52 (Extended Data Fig. 6c). For example, D. melanogaster has four GTSF paralogues. The D. melanogaster OSC and OSS cell lines, which are derived from somatic follicle cells that support oogenesis, express Piwi but lack the PIWI paralogues Aub and Ago3. Piwi-mediated, piRNA-guided transposon silencing in these cells requires Asterix, a GTSF1 orthologue 7,8,10 . In vivo, asterix mutants phenocopy piwi mutants and are female sterile, even though Piwi is successfully loaded with piRNAs and transits to the nucleus in the absence of Asterix 7,8 . Whether the other fly GTSF paralogues have a function in vivo, perhaps as auxiliary factors for Aub or Ago3, remains to be tested. Similar to fly Asterix, mouse GTSF1 is essential for piRNA function and fertility. In mice, two GTSF1 paralogues, GTSF1L and GTSF2, are also expressed during spermatogenesis and interact with PIWI proteins 54 . Unlike Gtsf1-mutant males, single and double Gtsf1l- and Gtsf2-knockout males are fertile 54 . Genes encoding the Gtsf paralogues are syntenic in mammals, although Gtsf2 has been lost in primates (Extended Data Fig. 7).
The central and C-terminal domains of GTSF orthologues are more similar among closely related species than among GTSF paralogues within the same species (Extended Data Figs. 4b and 6c and Extended Data Table 1), further supporting the view that this domain has evolved to bind specific PIWI proteins. Consistent with this idea, mouse GTSF1, GTSF1L and GTSF2 differ in their ability to potentiate target cleavage by MIWI and MILI. Whereas GTSF1 accelerated target cleavage by both MIWI and MILI, purified recombinant GTSF1L and GTSF2 (Extended Data Fig. 4b) efficiently potentiated target cleavage by MIWI but not by MILI (Fig. 3d-f and Extended Data Fig. 4c). GTSF2 was unable to detectably increase the rate of target cleavage by MILI. Although GTSF1L had a modest effect on the rate of target cleavage by MILI piRISC, this enhancement was less than one-sixteenth that provided by GTSF1 and half that of the PIWI-binding mutant GTSF1(W98A/W107A/W112A) (Fig. 3d-f and Extended Data Fig. 4c). We conclude that GTSF1L and GTSF2 are specialized to potentiate target cleavage by MIWI piRISC.
Silk moth BmGtsf1-like is more similar to mouse GTSF1 than to BmGtsf1 (Extended Data Fig. 6c and Extended Data Table 1)  test) but had no detectable effect on MILI (Extended Data Fig. 6a), further supporting the idea that the GTSF central domain determines the affinity of the protein for specific PIWI proteins. In vivo, BmGtsf1 interacts with BmSiwi and is required for transposon silencing and sex determination 11 . BmGtsf1 increased the rate of target cleavage by affinity-purified Siwi but not that of the other silk moth PIWI protein, BmAgo3 (Extended Data Fig. 6a).

MIWI and MILI require extensive base pairing
piRNA-target RNA complementarity from g2-g22 (that is, 21 base pairs) is required for efficient target cleavage directed by endogenous piRNAs loaded into MIWI piRISC immunoprecipitated from adult mouse testis 5 . However, GTSF1 does not detectably co-immunoprecipitate with MIWI piRISC from mouse testis 5,39 . A requirement for 21-base-pair complementarity between target and guide is, to our knowledge, unique among Argonaute proteins: fly Ago2 slices a target with as few as 11 contiguous base pairs 55 ; mammalian AGO2 requires only 11 contiguous base pairs for detectable cleavage 56 ; and the eubacterial DNA-guided DNA endonuclease TtAgo 57 requires as few as 14.
Affinity-purified MIWI, programmed with either of two different synthetic 30-nt piRNAs, readily cleaved target RNA complementary to guide nucleotides g2-g21 but not a target complementary to g2-g16 (Extended Data Fig. 8a,b). Under multiple-turnover conditions with saturating amounts of purified, recombinant GTSF1 ([GTSF1] ≫ [target] > [piRISC]), MIWI readily cleaved a target RNA with 19 nucleotides (g2-g20) complementary to its synthetic piRNA guide (Fig. 4a, top). The lower background of single-turnover experiments ([GTSF1] ≫ [piRISC] > [target]) enabled longer incubation times. Under these conditions, we could detect GTSF1-stimulated cleavage of a target RNA with as few as 15 complementary nucleotides (g2-g16; Fig. 4a, bottom). We note that the pachytene stage of meiosis in mouse spermatogenesis lasts about 175 h, and the pachytene piRNA pathway components are expressed until at least the round spermatid stage, a time interval spanning more than 400 h.

Guide length limits target cleavage by MIWI
In vivo, piRNAs are trimmed to a length characteristic of the PIWI protein in which they reside: around 30 nt for MIWI and 26-27 nt for MILI 13,14,58 . An attractive hypothesis is that these piRNA lengths are optimal for target cleavage catalysed by the specific PIWI protein.
Our data suggest a more complex relationship between piRNA length, PIWI protein identity and target complementarity. In the presence of GTSF1, MIWI loaded with a 30-nt piRNA and MILI loaded with a 26-nt piRNA readily cleaved a fully complementary target RNA in a 60-min reaction (Fig. 4b). By contrast, neither piRISC cleaved a target complementary to piRNA positions g2-g16 (Fig. 4a,b and Extended Data Fig. 8a,b). Similarly, both MIWI and MILI loaded with a 26- or 21-nt guide produced little cleavage for a target complementary to positions g2-g16, although both guide lengths supported cleavage of a fully complementary target (Fig. 4b). In fact, for MIWI, a 21-nt guide was more active than a 30-nt guide (Fig. 4b). Without GTSF1, MIWI or MILI loaded with any of these guide lengths produced little cleaved target in 60 min (Fig. 4b). Of note, MIWI or MILI loaded with a 16-nt guide RNA, a piRNA length that is not present in vivo, readily cleaved the RNA target (Fig. 4b).
GTSF1 accelerates the rate of pre-steady-state target cleavage (k burst ) by MIWI and MILI but has little effect on their slow rates of steady-state cleavage (k ss ), suggesting that piRISC remains bound to its cleavage by-products (Fig. 3b). We incubated GTSF1 and MIWI (loaded with a 30-, 26-, 21- or 16-nt guide) with a target RNA fully complementary to each guide and measured k burst and k ss (Fig. 4c). As the guide length decreased, k burst decreased and k ss increased. Compared with its native 30-nt guide length, the 16-nt guide increased k ss approximately ninefold and decreased k burst by more than 50-fold. These data suggest that as the guide was shortened, the cleaved products were released more rapidly.
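Burst and steady-state rates of this kind are usually extracted by fitting product formation over time to a burst-and-steady-state equation. Equation (1) is not reproduced in this section, so the sketch below uses the classic two-step burst equation as a stand-in; the function name, the parameterization and the example values are illustrative assumptions, not the authors' fit.

```python
import math

def product_formed(t_min: float, e_active: float, k_burst: float, k_ss: float) -> float:
    """Product (same units as e_active) at time t for a two-step burst mechanism.

    Assumed model: P(t) = E * [(a/(a+b))^2 * (1 - exp(-(a+b)*t)) + (a*b/(a+b)) * t],
    with a = k_burst (fast first turnover) and b = k_ss (slow regeneration of free enzyme).
    """
    total = k_burst + k_ss
    burst_amplitude = e_active * (k_burst / total) ** 2
    steady_slope = e_active * k_burst * k_ss / total
    return burst_amplitude * (1.0 - math.exp(-total * t_min)) + steady_slope * t_min

# Illustrative parameters: 1 nM active piRISC, k_burst = 1.1 min^-1, k_ss = 0.005 min^-1
for t in (0, 1, 5, 30, 60):
    print(f"t = {t:>2} min: {product_formed(t, 1.0, 1.1, 0.005):.3f} nM product")
```

When k ss is much smaller than k burst, the burst amplitude approaches the active enzyme concentration and the later linear phase has a slope of roughly the enzyme concentration times k ss, which is why a shorter guide that speeds product release raises k ss while leaving the burst to be set by the first turnover.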
We estimated the binding affinity (ΔG) of the piRNA for its target using the nearest-neighbour rules for base pairing at 37 °C. As the strength of the piRNA base pairing to the target increased, the rate of pre-steady-state cleavage increased (Fig. 5a), consistent with base pairing serving to extract the 3′ end of the piRNA from the PAZ domain, facilitating the transition to a more catalytically competent piRISC conformation 59 . Conversely, decreased base-pairing strength increased the steady-state rate of target cleavage, supporting the view that for biologically relevant piRNA lengths, product release is the rate-determining step for MIWI-catalysed target cleavage (Fig. 5a).
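A nearest-neighbour estimate sums a free-energy increment for each stacked pair of adjacent base pairs and adds an initiation term. The sketch below illustrates the calculation for a fully Watson-Crick paired RNA duplex. The parameter values are those commonly tabulated for the Turner-group rules at 37 °C and should be treated as assumptions to check against the primary tables; the parameter set used in the study is not stated in this section, the guide sequence is invented, and dangling ends, mismatches and the symmetry correction are ignored.

```python
# ΔG°37 increments (kcal/mol) keyed by the 5'->3' dinucleotide on one strand.
# Assumed values; verify before quantitative use.
NN_STACK = {
    "AA": -0.93, "UU": -0.93, "AU": -1.10, "UA": -1.33,
    "CU": -2.08, "AG": -2.08, "CA": -2.11, "UG": -2.11,
    "GU": -2.24, "AC": -2.24, "GA": -2.35, "UC": -2.35,
    "CG": -2.36, "GG": -3.26, "CC": -3.26, "GC": -3.42,
}
INITIATION = 4.09    # duplex initiation penalty (assumed value)
TERMINAL_AU = 0.45   # penalty per terminal A-U pair (assumed value)

def duplex_dg37(strand: str) -> float:
    """Estimated ΔG°37 for `strand` (RNA, 5'->3') fully paired to its complement."""
    strand = strand.upper()
    if len(strand) < 2 or set(strand) - set("ACGU"):
        raise ValueError("expected an RNA sequence of at least 2 nt")
    dg = INITIATION
    for i in range(len(strand) - 1):
        dg += NN_STACK[strand[i:i + 2]]
    for end in (strand[0], strand[-1]):
        if end in "AU":
            dg += TERMINAL_AU
    return dg

guide = "UGACGUACGUAGCUAGGCAUCGAUUCGGAA"  # hypothetical 30-nt guide, illustration only
for last in (16, 21, 30):
    paired = guide[1:last]  # guide positions g2 through g<last>
    print(f"g2-g{last}: {duplex_dg37(paired):.1f} kcal/mol")
```

Extending complementarity from g2-g16 to g2-g30 makes the estimated ΔG more negative, which is the sense in which longer pairing corresponds to stronger binding in the comparison above.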

Discussion
In nearly all animals, both piRNA biogenesis and piRNA function require the PIWI endoribonuclease activity. Yet purified mouse MIWI and MILI are intrinsically slow to cleave complementary target RNAs. Our data show that unlike Argonaute proteins, PIWI proteins require an auxiliary factor, GTSF1, to efficiently cleave their RNA targets.

Fig. 4 legend (fragment). Panel a label: 5 nM affinity-purified MIWI piRISC + 0.5 μM rGTSF1 + 3 nM target RNA. Top, multiple-turnover conditions; bottom, single-turnover conditions. All reactions contained saturating amounts of GTSF1. b, Target-cleavage assay using targets complementary to piRNA guide nucleotides g2-g16 or g2-g30 and MIWI or MILI loaded with piRNAs of the indicated lengths, with or without GTSF1 (mean ± s.d., n = 3). c, Absolute and relative pre-steady-state and steady-state rates of cleavage of the g2-g30 target by MIWI loaded with piRNAs of the indicated lengths in the presence of GTSF1 (mean ± s.d., n = 3). E apparent , apparent active enzyme concentration estimated from fitting data to equation (1). For gel source data, see Supplementary Fig. 1.

The ability of GTSF1 to potentiate PIWI-catalysed target cleavage provides a biochemical explanation for the genetic requirement for this small zinc-finger protein in the piRNA pathway. GTSF1 function requires that it bind both RNA and the PIWI protein, and differences among C-terminal domains restrict individual GTSF1 paralogues to specific PIWI proteins.
We propose a testable kinetic scheme for target cleavage by MIWI (Fig. 5b,c and Extended Data Fig. 9) that incorporates the requirement for GTSF1 and the observation that a 16-nt guide changes the rate-determining step of target cleavage catalysed by MIWI. As originally proposed for fly Ago2 59 , piRISC bound to a target is presumed to exist in two states: one in which the piRNA 3′ end is secured in the PAZ domain and a competing, pre-catalytic conformation in which the piRNA is fully base paired to its target. Structures of Piwi-A from the freshwater sponge E. fluviatilis show that extensive pairing between a piRNA and its target induces a catalytically competent conformation in which the PAZ domain is rotated away from the piRNA 3′ end 35 . We propose that the PAZ-bound state is more favourable for guides bound to PIWI proteins than to AGOs, requiring a high degree of complementarity between guide and target to extract the piRNA 3′ terminus from the PAZ domain. Our data also suggest that a slow rate of product release after cleavage results directly from the extensive piRNA-target RNA complementarity required for this conformational change. A 16 nt piRNA allows MIWI to more easily adopt a pre-catalytic state, perhaps because the 3′ end of the short guide cannot reach the PAZ domain. Notably, single-stranded siRNA guides as short as 14 nt allow mammalian AGO3, initially believed to have lost its endonuclease activity, to efficiently cleave RNA targets 60 . In golden hamsters, piRNAs bound to PIWIL1 are initially around 29 nt long, but a shorter population of approximately 23 nt piRNAs appears at metaphase II and predominates in 2-cell embryos 61 . piRNAs bound to PIWIL3, a female-specific PIWI protein that is absent from mice, are around 19 nt long in hamster and around 20 nt long in human oocytes 61,62 . We speculate that these short piRNAs enable PIWIL1 and PIWIL3 to function as multiple-turnover endonucleases.
Our model proposes that GTSF1 recognizes the pre-catalytic piRISC state, facilitating a second conformational change in PIWI proteins that may occur spontaneously in Argonaute clade proteins. GTSF1 binding probably stabilizes the catalytically active conformation, facilitating target cleavage, but has no detectable effect on subsequent release of the cleaved products. Slow product release may be a general property of PIWI proteins: purified E. fluviatilis PIWI similarly catalyses only a single round of target cleavage 35 .
Potentiation of the endonuclease activity of PIWI proteins by GTSF1 probably represents its ancestral function. It remains unknown why GTSF1 is required for the function of PIWI proteins such as mouse MIWI2 6,9 and fly Piwi 7,10 , which silence transcription rather than cleave target RNAs. We propose that GTSF1 stabilizes the functional conformation of non-catalytic PIWI proteins, enabling them to bind the downstream proteins required for transcriptional silencing.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-022-05009-0.

Fig. 5 legend (fragment). (2) GTSF1-dependent conversion of this piRISC pre-catalytic state (E C ) to the fully competent catalytic state (E C′ ). P, cleaved target products. K d , dissociation constant; k c and k −c , rate constants for pre-catalytic state; k on and k off , rate constants for GTSF1 binding; k chem , rate constant for the endoribonucleolytic chemistry step; k′ −c , rate constant for piRISC (EPAZ) regeneration. c, Proposed effects of different piRNA lengths, extent of guide-target complementarity and GTSF1 on the forward and reverse rates of the two conformational rearrangements. The wide, central cleft, as observed in the cryo-electron microscopy structure of E. fluviatilis Piwi-A 35 , is envisioned to allow the central region of a 30 nt piRNA to be mobile and exposed to solvent when its 3′ end is secured to the PAZ domain (top right).

Plasmids and cell lines
Supplementary Data Table 1 provides the sequences of all oligonucleotides used. To create pScalps_Puro_eGFP, an IRES-driven eGFP was inserted downstream of the multiple cloning site and upstream of the puromycin coding sequence of the lentivirus transfer vector pScalps_Puro. MIWI cDNA was obtained from Mammalian Gene Collection (https://genecollections.nci.nih.gov/MGC/). MILI cDNA was amplified by RT-PCR from mouse testis total RNA. Gibson assembly and restriction cloning were used to clone the MILI and MIWI coding sequences into pScalps_Puro_eGFP, fusing them in-frame with N-terminal 3×Flag and SNAP tags. Lentivirus transfer vectors were packaged by co-transfection with psPAX2 and pMD2.G (4:3:1) using TransIT-2020 (Mirus Bio) in HEK293T cells. Supernatant containing lentivirus was used to transduce HEK293T cells in the presence of 16 µg ml −1 polybrene (Sigma) to obtain stable PIWI-expressing cell lines. Three sequential transductions were performed to maximize recombinant protein production. The transduced cells were selected in the presence of 2 µg ml −1 Puromycin for two weeks, then the cells expressing the 5-10% highest eGFP fluorescence were selected by FACS (UMASS Medical School Flow Cytometry Core). Selected cells stably expressing the recombinant PIWI proteins were expanded, collected, and cell pellets flash-frozen and stored at −80 °C. Mouse GTSF1 cDNA was synthesized at Twist Biosciences and cloned into pCold-GST (Takara Bio) bacterial expression vector by restriction cloning. GTSF1 mutants, GTSF1L, and GTSF2-expressing pCold-GST vectors were synthesized at Twist Biosciences. pIZ-Flag-6×His-Siwi was described previously 63 . Flag-tagged MILI and MIWI-expressing vectors were the kind gift of Shinpei Kawaoka (Kyoto University, Kyoto, Japan). MmGtsf1 cDNA was amplified by PCR with reverse transcription (RT-PCR) from mouse spermatogonial stem cell total RNA 64 . BmGtsf1 and BmGtsf1-like cDNAs were amplified by RT-PCR from BmN4 cell total RNA. 
The amplified cDNA fragment and a DNA fragment coding V5SBP were cloned into pcDNA5/FRT/TO vector (Thermo Fisher Scientific) by In-fusion cloning (Takara). The E. muelleri Gtsf1 coding sequence was obtained from 65 and cloned into bacterial expression vector pSV272 (IJM).

Mice
Generation of 3×Flag-HA-tagged GTSF1 mice (C57BL6/J-Gtsf1 em1(Flag)Pdz ) was performed at Cyagen. The coding sequence for the tags was inserted into the endogenous locus by CRISPR-Cas9. In brief, fertilized mouse embryos were injected with the sgRNA targeting the sequence GTCTTCCATGCTGATGGCAAAGG (PAM), a 3×Flag-HA-tag cassette HDR donor, and Cas9 mRNA. Supplementary Data Table 1 provides the sequences of the HDR donor and oligonucleotide primers used for genotyping. Founder F 0 mice were genotyped and bred to generate F 1 mice carrying the germline-transmitted knock-in allele. Mice were maintained and used according to the guidelines of the Institutional Animal Care and Use Committee of the University of Massachusetts Chan Medical School (A201900331). C57BL/6J mice (RRID: IMSR_JAX:000664) were used as wild-type controls, where indicated. Animals were housed in an AAALAC-accredited barrier facility with controlled temperature (22 ± 2 °C), relative humidity (40% ± 15%), and a 12 h:12 h dark:light cycle.
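The sgRNA target given above can be split into its protospacer and PAM with a short check. The sketch assumes that the "(PAM)" annotation refers to the final three nucleotides of the printed sequence and that the nuclease recognizes an NGG PAM, as Streptococcus pyogenes Cas9 does; the Cas9 variant is not specified in this section, and the sequence is written here without whitespace.

```python
import re

# Target sequence from the Methods, assumed to be protospacer followed by PAM
target_with_pam = "GTCTTCCATGCTGATGGCAAAGG"

protospacer = target_with_pam[:-3]  # everything before the last three nucleotides
pam = target_with_pam[-3:]          # assumed PAM position

# NGG check: any nucleotide followed by two guanines
is_ngg = re.fullmatch(r"[ACGT]GG", pam) is not None

print(f"protospacer ({len(protospacer)} nt): {protospacer}")
print(f"PAM: {pam} (NGG: {is_ngg})")
```

Under these assumptions the sequence splits into a 20-nt protospacer and a 3-nt PAM that satisfies NGG.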
GTSF1, GTSF1 mutants and GTSF1 homologues. pCold-GST GTSF expression vectors were transformed into Rosetta-Gami 2 competent cells (Novagen). Cells were grown in the presence of 1 µM ZnSO4 at 37 °C until OD600 ~0.6-0.8, then chilled on ice for 30 min to initiate cold shock. Protein expression was induced with 0.5 mM IPTG for 18 h at 15 °C. Cells were collected by centrifugation, washed twice with PBS, and cell pellets were flash frozen and stored at −80 °C. Cell pellets were resuspended in lysis/GST column buffer containing 20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 1 mM DTT, 5% v/v glycerol, and complete EDTA-free protease inhibitor cocktail (Roche). Cells were lysed by a single pass at 18,000 psi through a high-pressure microfluidizer (Microfluidics M110P), and the resulting lysate was clarified at 30,000g for 1 h at 4 °C. Clarified lysate was filtered through a 0.2 µm low-protein binding syringe filter (Millex Durapore; EMD Millipore) and applied to glutathione Sepharose 4B resin (Cytiva) equilibrated with GST column buffer. After draining the flow-through, the resin was washed with 50 column volumes of GST column buffer. To elute the bound protein and cleave the GST tag in a single step, 50 U HRV3C Protease (Novagen) was added to the column, and the column was sealed and incubated for 3 h at 4 °C, after which it was drained to collect the cleaved protein.
The protein was diluted to 50 mM NaCl and further purified using a HiTrap Q (Cytiva) anion-exchange column equilibrated with 20 mM Tris-HCl, pH 7.5, 50 mM NaCl, 1 mM DTT, and 5% v/v glycerol. The bound protein was eluted using a 100-500 mM NaCl gradient in the same buffer. Peak fractions were analysed for purity by SDS-PAGE and the purest were pooled and dialysed into storage buffer containing 30 mM HEPES-KOH, pH 7.5, 100 mM potassium acetate, 3.5 mM magnesium acetate, 1 mM DTT, 20% v/v glycerol. To avoid precipitation of GTSF2, which has a pI of 7.3, 20 mM Tris-HCl, pH 8.8, was substituted for HEPES-KOH during purification and dialysis.
For Fig. 3h, the EmGtsf1 expression vector was transformed into BL21(DE3) cells (New England Biolabs). Transformed cells were grown in LB supplemented with 1 µM ZnSO4 at 37 °C until OD600 ~0.6-0.8. The incubation temperature was lowered to 16 °C and protein expression was induced by addition of 1 mM IPTG for 4 h. Cells were collected by centrifugation and cell pellets were flash frozen in liquid nitrogen and stored at −80 °C for future use. Thawed cell pellets were resuspended in lysis buffer (50 mM Tris, pH 8, 300 mM NaCl, 0.5 mM TCEP) and lysed by passage through a high-pressure (18,000 psi) microfluidizer (Microfluidics M110P). The resulting lysate was clarified by centrifugation at 30,000g for 20 min at 4 °C. Clarified lysate was applied to Ni-NTA resin (Qiagen) and incubated for 1 h. The resin was extensively washed with nickel wash buffer (300 mM NaCl, 20 mM imidazole, 0.5 mM TCEP, 50 mM Tris, pH 8). Protein was eluted in four column volumes of nickel elution buffer (300 mM NaCl, 300 mM imidazole, 0.5 mM TCEP, 50 mM Tris, pH 8). TEV protease was added to the eluted protein to cleave and remove the N-terminal His6 and MBP tags. The resulting mixture was dialysed against HiTrap Dialysis Buffer (300 mM NaCl, 20 mM imidazole, 0.5 mM TCEP, 50 mM Tris, pH 8) at 4 °C overnight. The dialysed protein was then passed through a 5-ml HiTrap Chelating column (Cytiva) and the unbound material collected. Unbound material was concentrated and further purified by size-exclusion chromatography using a Superdex 200 Increase 10/300 column (Cytiva) equilibrated in 50 mM Tris, pH 8, 300 mM NaCl, and 0.5 mM TCEP. Peak fractions were analysed for purity by SDS-PAGE, and the purest were pooled, concentrated to 150 µM, aliquoted, and stored at −80 °C.

Northern blotting
Northern blotting was performed as described 67. In brief, piRNA guide standards and PIWI RISCs were first resolved on a denaturing 15% polyacrylamide gel, followed by transfer to Hybond-NX (Cytiva) neutral nylon membrane by semi-dry transfer at 20 V for 1 h. Next, crosslinking was performed in the presence of 0.16 M EDC in 0.13 M 1-methylimidazole, pH 8.0, at 60 °C for 1 h. The crosslinked membrane was pre-hybridized in Church's buffer (1% w/v BSA, 1 mM EDTA, 0.5 M phosphate buffer, and 7% w/v SDS) at 45 °C for 1 h. A 5′ 32P-radiolabelled DNA probe (25 pmol) in Church's buffer was added to the membrane and allowed to hybridize overnight at 45 °C, followed by five washes with 1× SSC containing 0.1% w/v SDS. The membrane was air dried and exposed to a storage phosphor screen.

Chromatographic fractionation of the MIWI-potentiating activity
Dissected animal tissues were homogenized in lysis buffer in a Dounce homogenizer using 10 strokes of the loose-fitting pestle A, followed by 20 strokes of the tight-fitting pestle B. Lysate was clarified at 20,000g and then passed through a 0.2 µm filter to yield an S20 for further chromatographic purification. Lysates used without further purification were prepared directly in 30 mM HEPES-KOH, pH 7.5, 100 mM potassium acetate, 3.5 mM magnesium acetate, 1 mM DTT, and 20% v/v glycerol ('dialysis buffer') with 1× homemade protease inhibitor cocktail; column fractions were dialysed into this buffer before assaying. Protein concentration was measured using the BCA assay. Chromatography buffers were filtered before use.
For chromatography, lysate was prepared as described except using 30 mM HEPES-KOH, pH 7.5, 50 mM NaCl, 1 mM DTT, 5% v/v glycerol, and protease inhibitors. The lysate was applied to a HiTrap SP column (Cytiva) equilibrated with the lysis buffer. The column was washed, and the bound proteins were eluted stepwise using increasing NaCl concentrations. The SP column fractions containing the peak MIWI-potentiating activity were adjusted to 2 M NaCl and applied to a HiTrap Phenyl column (Cytiva) equilibrated with column buffer containing 2 M NaCl. Bound proteins were eluted stepwise using decreasing NaCl concentrations. The peak MIWI-potentiating fractions eluted from the HiTrap Phenyl column were pooled, concentrated (10 kDa MWCO Amicon Ultra filter), and applied to a Superdex 200 Increase 10/300 GL size-exclusion chromatography column (bed volume ~24 ml) equilibrated with the dialysis buffer but containing 5% v/v glycerol. The void volume (V0) of the gel filtration column was determined with blue dextran, and all fractions (0.5 ml each), from just before V0 to the end of the column (Vt), were assayed for MIWI-potentiating activity. The molecular mass of the potentiating activity was determined relative to molecular mass markers (beta amylase, 200 kDa; alcohol dehydrogenase, 150 kDa; albumin, 66 kDa; carbonic anhydrase, 29 kDa; and cytochrome C, 12.4 kDa).
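The molecular mass estimate described above can be illustrated with a conventional gel-filtration calibration, sketched here under stated assumptions. The marker masses are those listed in the text; the elution volumes, the void volume, and the use of the ~24 ml bed volume as the total volume are hypothetical placeholders, and the log10(mass) versus K_av linear calibration is a common convention rather than a procedure confirmed by the text.

```python
import math


def k_av(ve_ml, v0_ml, vt_ml):
    """Partition coefficient: 0 at the void volume, 1 at the total volume."""
    return (ve_ml - v0_ml) / (vt_ml - v0_ml)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(ve_ml, v0_ml, vt_ml, standards):
    """Estimate mass (kDa) from a log10(mass) versus K_av calibration.

    standards: list of (mass_kda, elution_volume_ml) pairs.
    """
    xs = [k_av(v, v0_ml, vt_ml) for _, v in standards]
    ys = [math.log10(m) for m, _ in standards]
    slope, intercept = fit_line(xs, ys)
    return 10 ** (intercept + slope * k_av(ve_ml, v0_ml, vt_ml))


# Masses (kDa) are from the text; elution volumes (ml) are HYPOTHETICAL.
STANDARDS = [(200, 11.5), (150, 12.3), (66, 14.2), (29, 16.3), (12.4, 18.0)]
V0_ML = 8.0   # HYPOTHETICAL void volume (measured with blue dextran in practice)
VT_ML = 24.0  # ASSUMPTION: total volume approximated by the ~24 ml bed volume

# Apparent mass of an activity peak eluting at a hypothetical 15.0 ml.
apparent_mass_kda = estimate_mass_kda(15.0, V0_ML, VT_ML, STANDARDS)
```

With 0.5 ml fractions, the elution volume of the activity peak is known only to within about one fraction, so an estimate of this kind is best read as an approximate apparent mass.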

Zn²⁺ and Ni²⁺ immobilized metal affinity chromatography
HiTrap Chelating HP (Cytiva) columns were charged with 0.1 M NiSO4 or ZnSO4, washed with water, and then equilibrated in column buffer (20 mM potassium phosphate buffer, pH 7.5, 500 mM NaCl, 0.5 mM DTT, 5% v/v glycerol). S20 testis lysate was applied to the column, the flow-through was collected, and the column was washed with the column buffer until absorbance at 280 nm stabilized. Bound proteins were eluted in two steps: first, with 20 mM potassium phosphate, pH 7.5, 2 M ammonium chloride, 0.5 mM DTT, 5% v/v glycerol (elution buffer 1), and, second, with 20 mM potassium phosphate, pH 7.5, 500 mM NaCl, 200 mM imidazole, pH 8.0, 0.5 mM DTT, 5% v/v glycerol (elution buffer 2). The peak of each step was dialysed into the dialysis buffer and assayed for the ability to potentiate MIWI catalysis.
Radiolabelled target (3-100 nM final concentration) was added to purified PIWI piRISC (2-8 nM), plus ~1 µg tissue or sorted germ cell lysate per 10 µl reaction volume or 0.5 µM (final concentration) of purified GTSF protein. At the indicated times, an aliquot of a master reaction was quenched in 4 volumes of 50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 25 mM EDTA, 1% w/v SDS, then proteinase K (1 mg ml⁻¹ final concentration) was added and the sample was incubated at 45 °C for 15 min. An equal volume of urea loading buffer (8 M urea, 25 mM EDTA) was added to each time point, and the samples were heated at 95 °C for 2 min and resolved by 7-10% denaturing PAGE. Gels were dried, exposed to a storage phosphor screen, and imaged on a Typhoon FLA 7000 (GE Healthcare).
The raw image file was used to quantify the substrate and product bands, corrected for background. Data were fit to a reaction scheme in which the time dependence of product formation corresponded to a pre-steady-state exponential burst ($k_{\mathrm{burst}} = k_2 + k_3$) followed by a linear steady-state phase, described by $k_{\mathrm{cat}}$, where $k_{\mathrm{cat}} = k_{\mathrm{ss}} = k_2 k_3/(k_2 + k_3)$.
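The relationships between the burst and steady-state constants can be made concrete with the textbook burst-and-steady-state expression, sketched below. This is a minimal illustration that assumes saturating target, rapid target binding, and an irreversible two-step turnover (cleavage with rate constant k2, product release with rate constant k3); it is not necessarily the exact equation or software the authors used, and the function names and per-band background subtraction are hypothetical.

```python
import math


def fraction_cleaved(substrate_signal, product_signal, background=0.0):
    """Fraction of target cleaved from band intensities.

    ASSUMPTION: the same background value is subtracted from each band.
    """
    s = substrate_signal - background
    p = product_signal - background
    return p / (s + p)


def burst_constants(k2, k3):
    """Return (k_burst, k_ss, amplitude) for the two-step scheme.

    k_burst = k2 + k3 and k_ss = k2*k3/(k2 + k3), as defined in the text;
    amplitude = (k2/(k2 + k3))**2 is the textbook burst amplitude per enzyme.
    """
    k_burst = k2 + k3
    return k_burst, k2 * k3 / k_burst, (k2 / k_burst) ** 2


def burst_product(t, e0, k2, k3):
    """Product formed at time t: exponential burst plus linear phase.

    e0 is the active enzyme (piRISC) concentration; the result has the
    same concentration units as e0.
    """
    k_burst, k_ss, amplitude = burst_constants(k2, k3)
    return e0 * (amplitude * (1.0 - math.exp(-k_burst * t)) + k_ss * t)


# Illustrative values only (not measurements from this study).
k_burst, k_ss, amplitude = burst_constants(k2=0.1, k3=0.01)
product_at_60 = burst_product(60.0, e0=5.0, k2=0.1, k3=0.01)
```

In this form, the burst rate is dominated by the faster step, whereas the steady-state rate is limited by the slower one, which is why the two phases report on different steps of the reaction.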
The affinity ($K_d$) of wild-type GTSF1 or GTSF1(W98A/W107A/W112A) for the MIWI piRISC-target ternary complex and the maximum observable rate ($k_{\mathrm{pot}}$) were estimated by measuring the pre-steady-state rate of target cleavage ($k_{\mathrm{burst}}$) of MIWI at increasing concentrations of GTSF1 (1-5,000 nM) and fitting the data 70 to equation (2) 74 .
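Because equation (2) is not reproduced in the text, the sketch below uses a simple hyperbolic saturation model purely as an illustration of how a dissociation constant and a maximum rate can be extracted from such a titration. The hyperbolic form is an assumption; it holds only when GTSF1 is in large excess over piRISC, and a quadratic (tight-binding) form may be more appropriate at the lowest concentrations. The fitting routine is a plain grid search written for clarity and is not the authors' method.

```python
def k_burst_hyperbolic(gtsf1_nm, k_pot, kd_nm):
    """ASSUMED model: k_burst = k_pot * [GTSF1] / (Kd + [GTSF1])."""
    return k_pot * gtsf1_nm / (kd_nm + gtsf1_nm)


def fit_hyperbola(conc_nm, rates, kd_min=0.1, kd_max=1e5, n_grid=2000):
    """Least-squares estimate of (k_pot, Kd) by a log-spaced grid over Kd.

    For each trial Kd the best k_pot has a closed form, because the
    model is linear in k_pot once Kd is fixed. Concentrations must be > 0.
    """
    best = None
    for i in range(n_grid):
        kd = kd_min * (kd_max / kd_min) ** (i / (n_grid - 1))
        f = [c / (kd + c) for c in conc_nm]
        k_pot = sum(r * fi for r, fi in zip(rates, f)) / sum(fi * fi for fi in f)
        sse = sum((r - k_pot * fi) ** 2 for r, fi in zip(rates, f))
        if best is None or sse < best[0]:
            best = (sse, k_pot, kd)
    return best[1], best[2]


# Synthetic, noise-free data spanning the 1-5,000 nM range used in the text.
# The "true" values (k_pot = 0.5, Kd = 100 nM) are HYPOTHETICAL.
concentrations = [1, 5, 20, 50, 100, 500, 1000, 5000]
rates = [k_burst_hyperbolic(c, 0.5, 100.0) for c in concentrations]
k_pot_fit, kd_fit = fit_hyperbola(concentrations, rates)
```

Under this model, the cleavage rate reaches half of its maximum when the GTSF1 concentration equals the dissociation constant.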

Mouse germ cell purification
Germ cells from mouse testes were sorted and purified as described 19,20 .
In brief, freshly dissected mouse testes were decapsulated with 0.4 mg ml⁻¹ collagenase type IV (Worthington) in 1× Gey's balanced salt solution (GBSS) at 33 °C for 15 min. The separated seminiferous tubules were treated with 0.5 mg ml⁻¹ trypsin and 1 µg ml⁻¹ DNase I in 1× GBSS at 33 °C for 15 min. Trypsin was then inactivated by adding 7.5% v/v fetal bovine serum (FBS). The cell suspension was filtered through a 70-µm cell strainer, and cells were pelleted at 300g at 4 °C for 10 min. Cell staining was performed at 33 °C for 15 min with 5 µg ml⁻¹ Hoechst 33342 prepared in 1× GBSS, 5% v/v FBS, and 1 µg ml⁻¹ DNase I, then the cells were treated with 0.2 µg ml⁻¹ propidium iodide, followed by a final pass through a 40-µm cell strainer before sorting. Sorted cells were pelleted at 100g for 5 min, the buffer was removed, and the cell pellets were flash frozen and stored at −80 °C. Cell lysates were prepared as described for HEK293T cells. Protein concentration was estimated using the BCA assay, and an equal amount of total protein from each cell type was used to assay for the ability to potentiate MIWI catalysis.

Analysis of RNA-seq data
Publicly available datasets 17,75-77 were analysed. rRNA reads were removed using Bowtie 2.2.5 with default parameters 78. After rRNA removal, the remaining reads were mapped to the corresponding genomes (mouse, mm10; rat, rn6; macaque, rheMac8; human, hg19) using STAR 2.3 with default parameters that allowed ≤2 mismatches and 100 mapping locations 79. Mapped results were generated in SAM format, duplicates were removed, and the results were converted to BAM format using SAMtools 1.8 80. HTSeq 0.9.1 with default parameters was used to count uniquely mapping reads 81. Steady-state transcript abundance was reported in reads per kilobase per million uniquely mapped reads (RPKM).
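For reference, the RPKM normalization named above can be written as a short calculation. This is a minimal sketch of the standard definition, not the authors' script; the feature length used as the denominator (for example, total exonic length) and the gene names are assumptions for illustration.

```python
def rpkm(read_count, feature_length_bp, total_unique_reads):
    """Reads per kilobase per million uniquely mapped reads.

    RPKM = reads / (length in kb) / (uniquely mapped reads in millions)
         = reads * 1e9 / (length in bp * uniquely mapped reads)
    """
    if feature_length_bp <= 0 or total_unique_reads <= 0:
        raise ValueError("length and total read count must be positive")
    return read_count * 1e9 / (feature_length_bp * total_unique_reads)


def rpkm_table(counts, lengths_bp, total_unique_reads=None):
    """Convert {gene: count} to {gene: RPKM}.

    ASSUMPTION: if no library size is supplied, the sum of the counts
    is used as the number of uniquely mapped reads.
    """
    if total_unique_reads is None:
        total_unique_reads = sum(counts.values())
    return {
        gene: rpkm(n, lengths_bp[gene], total_unique_reads)
        for gene, n in counts.items()
    }


# HYPOTHETICAL counts and lengths: 500 reads on a 2,000-bp transcript
# in a library of 10 million uniquely mapped reads.
example = rpkm(500, 2000, 10_000_000)
```

Normalizing by both feature length and library size allows abundances to be compared across genes of different lengths and across datasets of different sequencing depths.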

Methylarginine analysis
Recombinant MIWI was immunopurified and resolved by electrophoresis on a 4-20% gradient SDS-polyacrylamide gel. The gel was fixed and stained with Coomassie G-250 (Simply Blue, Invitrogen), and the recombinant MIWI band was excised and analysed at the UMASS Mass Spectrometry Core. Gel slices were chopped into ~1 mm² pieces, and 1 ml water was added, followed by 20 µl 45 mM DTT in 250 mM ammonium bicarbonate. Samples were incubated at 50 °C for 30 min and cooled to room temperature, then 20 µl 100 mM iodoacetamide was added and the samples were incubated for 30 min. The solution was removed, and the gel pieces were washed three times with 1 ml water, then 1 ml 50 mM ammonium bicarbonate:acetonitrile (1:1), quenched with 200 µl acetonitrile, and dried in a SpeedVac. Gel pieces were then rehydrated in 50 µl 50 mM ammonium bicarbonate containing 4 ng µl⁻¹ trypsin (Promega, Madison, WI) and 0.01% ProteaseMAX (Promega) and incubated at 37 °C for 18 h. Supernatants were collected, and the gel pieces were extracted with 200 µl of an 80:20 solution of acetonitrile:1% v/v formic acid in water. Supernatants were then combined, and the peptides were lyophilized in a SpeedVac, resuspended in 25 µl 5% acetonitrile, 0.1% v/v formic acid, and subjected to mass spectrometry analysis. Data were acquired using a NanoAcquity UPLC (Waters Corporation) coupled to an Orbitrap Fusion Lumos Tribrid (Thermo Fisher Scientific) mass spectrometer. Peptides were trapped and separated using an in-house 100 µm internal diameter fused-silica pre-column (Kasil frit) packed with 2 cm ProntoSil (Bischoff Chromatography) C18 AQ (200 Å, 5 µm) media and configured to an in-house packed 75 µm internal diameter fused-silica analytical column (gravity-pulled tip) packed with 25 cm ProntoSil (Bischoff; 100 Å, 3 µm) media. Mobile phase A was 0.1% v/v formic acid in water; mobile phase B was 0.1% v/v formic acid in acetonitrile. 
Following a 3.8 µl sample injection, peptides were trapped at a flow rate of 4 µl min⁻¹ with 5% B for 4 min, followed by gradient elution at a flow rate of 300 nl min⁻¹ from 5-35% B in 90 min (total run time, 120 min). Electrospray voltage was delivered by a liquid junction electrode (1.5 kV) located between the columns, and the transfer capillary to the mass spectrometer was maintained at 275 °C. Mass spectra were acquired over m/z 300-1,750 Da with a resolution of 120,000 (m/z 200), maximum injection time of 50 ms, and an AGC target of 400,000. Tandem mass spectra were acquired using data-dependent acquisition (3 s cycle) with an isolation width of 1.6 Da, HCD collision energy of 30%, resolution of 15,000 (m/z 200), maximum injection time of 22 ms, and an AGC target of 50,000. Raw data were processed using Proteome Discoverer 2.1.1.21 (Thermo Fisher Scientific), and the database search was performed by Mascot 2.6.2 (Matrix Science) using the Swiss-Prot human database (downloaded 4 September 2019; https://www.uniprot.org/). Search parameters were: semi-tryptic digestion with up to two missed cleavages; precursor mass tolerance 10 ppm; fragment mass tolerance 0.05 Da; peptide N-terminal acetylation, cysteine carbamidomethylation, methionine oxidation, N-terminal glutamine to pyroglutamate conversion, arginine methylation and arginine dimethylation were specified as variable modifications. Peptide and protein validation and annotation were performed in Scaffold 4.8.9 (Proteome Software, Portland, OR) employing Peptide Prophet 82 and Protein Prophet 83. Peptides were filtered at 1% FDR, while the protein identification threshold was set to greater than 99% probability with a minimum of two identified peptides per protein. Only arginine modification sites detected in all three replicates from separate immunoprecipitation experiments are reported in the figure. Figures 1b, 3, 4b,c and 5a and Extended Data Fig. 4a show mean ± s.d. for three independent trials. Figure 2a

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All data are available from the authors upon request. Mass spectrometry data are available from MassIVE using accession number MSV000089490.