RISC in Entamoeba histolytica: Identification of a Protein-Protein Interaction Network for the RNA Interference Pathway in a Deep-Branching Eukaryote

ABSTRACT Entamoeba histolytica is a protozoan parasite that causes amebiasis in humans and is a major health concern in developing countries. Our previous work revealed a functional RNA interference (RNAi) pathway in Entamoeba. Several unusual features encompass the RNAi pathway in the parasite, including small RNAs (sRNAs) with a 5′-polyphosphate structure (identified to date only in Entamoeba and nematodes) and the conspicuous absence of a canonical Dicer enzyme. Currently, little is known about the Entamoeba RNA-induced silencing complex (RISC), which is critical in understanding how RNAi is achieved in the parasite. In this study, we examined the RISC of EhAgo2-2, the most highly expressed Argonaute protein in Entamoeba. We identified 43 protein components of EhAgo2-2 RISC with a broad range of functional activities. Two proteins with nucleosome assembly protein (NAP) domains, not previously observed in other RNAi systems, were identified as novel core members of amebic RISC. We further demonstrated the interaction of these NAPs with Ago using an in vitro recombinant system. Finally, we characterized the interaction network of five RISC components identified in this study to further elucidate the interactions of these RNAi pathway proteins. Our data suggest the presence of closely interacting protein groups within RISC and allowed us to build a map of protein-protein interactions in relation to Ago. Our work is the first to elucidate RISC components in Entamoeba and expands the current knowledge of RISC to a deep-branching single-celled eukaryote.

nucleosome assembly protein (NAP) domain. We demonstrated a specific interaction between the two NAPs and Ago using an in vitro recombinant system. Interestingly, NAPs have not been previously reported in other Ago RISC systems. Furthermore, we identified protein components associated with two EhRdRPs and found that more than half of the Ago RISC components are shared with them, suggesting that these proteins may function together to orchestrate and effectively carry out RNAi in the parasite. Finally, we characterized five RISC components and demonstrated the existence of closely bound protein groups within the RISC and built a preliminary protein-protein interaction network in relation to EhAgo2-2. Our work is the first to elucidate Ago RISC cofactors in Entamoeba, and the findings expand current knowledge on RISC to a deep-branching single-celled eukaryote.

RESULTS
Immunoprecipitation of Myc-tagged Ago protein.
To identify RISC components associated with EhAgo2-2, we relied on three previously constructed cell lines: Myc-EhAgo2-2, Myc-EhAgo2-2 DNLS-DR , and Myc-DHFR-EhAgo2-2 PAZ-mut . All three cell lines overexpress an N-terminal Myc-tagged Ago protein (wild type [WT] or mutants). As noted in reference 30, Myc-tagged EhAgo2-2 exhibits strong binding to sRNA and localizes primarily to the nucleus, with a faint cytoplasmic signal. The Myc-tagged EhAgo2-2 DNLS-DR variant exhibits sRNA binding activity but does not localize to the nucleus. Under similar assay conditions, the Myc-tagged DHFR-EhAgo2-2 PAZ-mut variant has a localization pattern similar to that of WT EhAgo2-2 but is impaired in its ability to bind sRNA. Table 1 summarizes the observed features of EhAgo2-2 for these three cell lines. To identify RISC components specifically associated with EhAgo2-2, we included a cell line that overexpresses an irrelevant protein (luciferase without a Myc tag) to serve as a drug selection control. We prepared whole-cell lysates from the four cell lines and performed anti-Myc immunoprecipitation (IP) under the same conditions. Western blotting and silver staining were used to check protein enrichment following IP (examples of EhAgo2-2 WT and the DNLS-DR mutant IPs are shown in Fig. S1 in the supplemental material). We observed a specific pulldown of Myc-tagged Ago protein in the IP sample; actin showed no signal in these IPs, indicating the specific enrichment of the Myc-tagged Ago in the experiment. We performed three biological replicates of IP samples for each cell line. Proteins that were pulled down by IP were released from the beads using SDS-PAGE sample loading buffer, separated by SDS-PAGE, and sent for liquid chromatography-tandem mass spectrometry (LC-MS/MS) analyses after ingel trypsin digestion.
Selection criteria and RISC content analysis for EhAgo2-2. Proteomic analysis of interaction networks often has variability or reproducibility issues, as many factors/ steps are involved during sample preparation and instrumental detection. To limit potential false positives, we used the following experimental parameters to determine the protein candidates (shown in Fig. 1): Byonic software analysis was performed on mass spectrometry raw data with a false-discovery rate (FDR) of 1% to report peptides (36) and inferred proteins from the E. histolytica genome. Each mass spectrometry run typically generated 100 to 150 protein hits for each IP sample. We first separated nonribosomal protein hits from ribosomal hits. Nonribosomal protein hits that appeared in at least 2 of 3 data sets were selected for further analysis. We then compared either the WT Ago or Ago mutants with the Luc control and generated two sublists thereafter: (i) an exclusive list, consisting of protein hits that are in the Ago samples but not in the control, and (ii) an enriched list, consisting of protein hits that overlapped between the Ago sample and the control but with $ 3-fold enrichment of protein hits in the Ago samples based on average spectrum counts. The data analysis workflow for WT EhAgo2-2 is shown in Fig. S2A. There were 61 nonribosomal protein hits detected in at least 2 of the 3 data sets, while the luciferase (Luc) control had 27 nonribosomal protein hits under the same criteria. A comparison of EhAgo2-2 and the Luc control resulted in 41 exclusive protein hits and two enriched protein hits based on the criteria established as described above. Thus, we identified 43 protein hits as components of WT EhAgo2-2 RISC. The 43 identified protein hits were further weighed based on their Byonic score (jlog P valuej [LPV]). The top decoy value was used to separate each sample data set into three categories: high-confidence hits ($12 above the top decoy LPV in at least 2 of 3 data sets, which indicates $ 99% confidence level); moderate-confidence hits (above the top decoy LPV and within its 10.7 to 2 range in at least 2 of 3 data sets, which indicates a confidence level of 80% to 99%); and low-confidence hits (does not meet the above-described two criteria). However, it is important to note that low-confidence protein hits can still identify true partner proteins that associate with RISC. Variables such as low protein expression or limited trypsin digestion sites can contribute to how these proteins are detected and ranked by mass spectrometry, thus affecting their representation in our data sets. Table 2 lists the 43 protein hits as components of WT EhAgo2-2 RISC.
Functional groups represented by EhAgo2-2 RISC components. To better understand the 43 EhAgo2-2 RISC components identified, we investigated their potential cellular function(s) based on current genome annotations or functional domains (InterPro domain in AmoebaDB). Figure 2A diagrams the functional annotations of these RISC components using a colored bubble plot. The protein hits are annotated with a diverse array of functions, including transcriptional and translational regulation, sRNA loading, RNA binding, vesicle trafficking, and protein folding and modification, and may be FIG 1 Schematic of experimental design and mass spectrometry analysis and its data processing criteria in this study. Each IP sample, performed in triplicates, was processed by in-gel trypsin digestion for LC-MS/MS. Raw data were analyzed by the Byonic tool for the identification of protein hits. Stepwise selection criteria, performed as listed, were used to prioritize protein hits. Zhang et al. involved in cell signaling and metabolic pathways. Interestingly, several conserved RISC components associated with mammalian systems, such as Dicer, transactivating response RNA-binding protein (TRBP), GW182/TNRC6B, or exportin 8, were not identified in our study. This is in agreement with our in silico bioinformatic search of the Entamoeba genome; these mammalian RISC components are conspicuously absent from the Entamoeba genome (29). However, considering that the E. histolytica genome is not well characterized, with more than half of the predicted genes being hypothetical (53%), we cannot definitively rule out the absence of canonical RISC machinery found in model organisms. The other RNAi machinery components encoded by the E. histolytica genome are EhRdRPs (EHI_139420, EHI_179800, and EHI_086260) and EhAgos (EhAgo2-1, -2-2, and -2-3) (29). However, none of these known RNAi proteins were identified in our EhAgo2-2 RISC data set, possibly due to low expression of these  S2A in the supplemental material). High-confidence hits are in green ($2 above top decoy score, with $99% confidence level); moderate-confidence hits are in light green (above top decoy score but in the range 10.7 to 2, with confidence level 80% to 99%); low-confidence hits are in gray (hits that do not meet the two criteria). Protein hits identified either by all three data sets (3/ 3) or 2 of 3 data sets (*) are listed. Our mass spectrometry analyses for two functional EhAgo2-2 mutants (Table S1) show RISC components can be affected by mutants, with some protein hits being absent or showing association level change: either weakening (from green in WT to gray in mutant) or strengthening (from gray in WT to green in mutant) under each mutant condition. (Continued on next page) Zhang et al. proteins (EhRdRPs and two other EhAgos) in the parasite (29). Alternatively, EhRdRPs or EhAgos could form separate functional complexes outside the EhAgo2-2 RISC. Tudor-like protein (EHI_005150, a hypothetical protein with a Tudor domain by E value 2.1e210) and RecQ helicase (EHI_028890) were identified in our analysis and could be potential homologs of known RISC components in model systems. The Tudor domain protein is a highly conserved Ago/Piwi RISC component found in C. elegans, Drosophila, and mammals (37). QDE-3, one of the RecQ family proteins, is an important RNAi factor involved in quelling in Neurospora. The rat rRecQ-1, a homolog of QDE-3, is found to be associated with Piwi RISC (18). The presence of two highly conserved RISC components in our data set validates our approach toward revealing the true interactome of RISC components in E. histolytica. Further studies are needed to elucidate the exact functions of E. histolytica Tudor-like protein and RecQ helicase and their roles in the RNAi pathway.
EhAgo2-2 RISC contains a set of chaperone proteins (Hsp70, DnaJ homolog subfamily A member 1, and chaperonin 1) reminiscent of RISC chaperone machinery proteins (Hsc70/Hsp90 and others) found in model systems (38). RISC formation often involves multiple steps, such as loading of sRNA duplexes, unwinding, and passenger strand cleavage (39). Identification of chaperone machinery proteins in EhAgo2-2 RISC indicates that a similar process of sRNA loading may exist in the parasite. However, Hsp90 is not seen in our data set despite being highly expressed in E. histolytica (40). During heat stress, E. histolytica parasites upregulate many chaperone proteins, including Hsp90 and multiple Hsp70s, as well as cofactors Hsp40 (DnaJ), BAG-1, and CHIP (41). The EhAgo2-2 RISC chaperones found in this study are different from those found during heat stress, suggesting that the two different biological events/processes involve different sets of chaperone proteins.
EhAgo2-2 RISC contains two NAP domain proteins: EHI_072030 and EHI_120340 (which we will refer to as NAP07 and NAP12, respectively). It is worth noting that these two proteins were among the top-ranked protein hits in all three replicates of EhAgo2-2. NAP domain proteins, functioning in nucleosome formation or as histone chaperones, have not been previously identified in other RNAi systems (42). Searching the E. histolytica's genome revealed 11 potential proteins with annotated NAP domains. Phylogenetic comparison and alignment analysis using Clustal Omega showed the divergence of EhNAPs from NAPs found in model systems (see Fig. S3A). Compared with yeast NAP1 (yNAP1), all EhNAPs are relatively short, lacking ;60 residues at the N terminus (similar to Caenorhabditis elegans NAP1 [CeNAP1]) (Fig. S3B). Some EhNAPs, including NAP07 and NAP12, appear to also lack the acidic C-terminal tail ( Fig. S3B and C). The acidic C-terminal tail is a unique characteristic of NAP proteins which contributes to histone binding (43). Future studies will be required to examine the function of NAP07 and NAP12 in Entamoeba, specifically their activity pertaining to RNAi. These studies could also elucidate NAP07 and NAP12's potential involvement in histone binding or nucleosome formation.
Lastly, our Ago RISC analysis indicated that the known transcription factor URE3-BP (upstream regulatory element 3-binding protein) (EHI_060740) is a major EhAgo2-2 RISC component. URE3-BP was previously identified as a calcium-responsive regulator for two virulence genes: Hgl5 (Gal/GalNAc lectin) and Fdx1 (ferredoxin) (44). URE3-BP Protein functions are based on either current gene annotation in the genome or InterPro protein domain indicated in AmoebaDB. Protein hits identified in this study exhibit a broad range of biological activities, including transcriptional and translational regulation, sRNA loading, and cell signaling. There are two proteins with NAP domains (shown as thick lines) which, interestingly, have not previously been observed in RISCs from other studied systems. RecQ helicase and Tudor-like protein are the only two protein hits that can be linked to known RNAi factors found in model systems. Both the RecQ helicase and Tudor-like protein were identified as low-confidence hits, as shown in the dashed-line box. (B and C) Three-way comparison of RISC components among EhAgo2-2 WT and its two mutants. Mass spectrometry analyses pinpointed the RISC components affected in these two functional mutants. There are 23 protein hits found in all three samples, suggesting that they are likely core components of Ago RISC. Each mutant also shows a small set of proteins whose association with RISC was affected: four protein hits are PAZmut specific (overlap between WT and DNLS-DR mutant); 10 protein hits (notably URE3-BP and Tudor-like protein) are DNLS-DR mutant specific (overlap between WT and PAZ-mut); six protein hits are absent in both mutants.
RISC Components in E. histolytica broadly regulates parasite virulence, as demonstrated by transcriptome profiling both in vitro and in vivo (45,46). Thus, the amebic RNAi pathway likely participates in the regulation of genes that control parasite virulence.
RISC components of two Ago mutants. Our study also included two functional EhAgo2-2 mutants to pinpoint the effects of these mutants on the RISC contents. These two mutant proteins have either impaired nuclear localization (Myc-EhAgo2-2 DNLS-DR ) or impaired sRNA binding (Myc-DHFR-EhAgo2-2 PAZ-mut ) (30). Using a similar strategy of IP mass spectrometry analysis, we found 32 exclusive protein hits and one enriched hit for EhAgo2-2 DNLS-DR RISC ( Fig. S2B and Table S1A) and 98 exclusive hits and 6 enriched hits for the EhAgo2-2 PAZ-mut RISC ( Fig. S2C and Table S1B).
We performed three-way comparisons of the RISC data sets (shown in Fig. 2B and C) and identified 23 protein hits shared among all three data sets. These 23 proteins that were unaffected by either mutant are likely bona fide components of Ago RISC. Furthermore, these 23 proteins were cross-confirmed by nine mass spectrometry data sets in our study. Among these hits, the two NAPs (NAP07 and NAP12) remained as high-confidence hits in all three samples, indicating a strong association to Ago. There were four proteins affected by the PAZ mutant (overlap between the WT and the DNLS-DR mutant but not the PAZ-mut mutant) and 10 proteins affected by the DNLS-DR mutant (overlap between WT and PAZ-mut but not DNLS-DR), as listed in the Fig. 2C. Finally, six proteins were absent in both mutants, including DnaJ family protein and DNA repair protein (EHI_114930).
We noticed several protein hits against WT Ago RISC were ranked differently under the two mutant conditions, which could be interpreted as association-level changes within RISC. We observed six high-confidence protein hits in the WT (two malic enzymes, two DnaJ proteins, MIR domain, and URE3-BP) were absent or became low-confidence hits in the two mutants. The absence of URE3-BP in both mutants is interesting, as this transcription factor is a key regulator for amebic virulence. The EhAgo2-2 DNLS-DR mutant had a substantial effect on the Ago RISC composition: redox proteins were altered (all six low-confidence ranked peroxiredoxin hits became high/medium hits in the mutant), and several high-confidence hits were lost, including enhancer binding protein-1 (EBP-1), Hsp70, and elongation factor 1-alpha 1. For the EhAgo2-2 PAZ-mut mutant, we noticed that it contained many hits associated with proteins involved in the ubiquitination pathway (Table S1B). This observation could be due to the overexpressed mutant protein triggering a protein degradation pathway in the parasite.
Cross-validation of Ago RISC components by tagging additional RNAi pathway protein. To validate the identified EhAgo2-2 RISC components, we examined other known RNAi pathway proteins in E. histolytica. Among RNAi pathway proteins found in this parasite are Agos and RdRPs. The RdRP proteins function in the RNAi gene silencing pathway and are responsible for the biogenesis of secondary sRNAs in yeast, C. elegans, and plants (35). It has been shown that RdRPs are tightly linked to Ago RISC in yeast and C. elegans by mass spectrometry analysis (15,47). Thus, we selected two EhRdRPs (EHI_139420 and EHI_179800) and overexpressed Myc-tagged EhRdRP proteins and confirmed their expression by Western blotting (see Fig. S4A). Immunofluorescence assay (IFA) analysis (Fig. 3A) of Myc-EhRdRP1 shows whole-cell staining, while Myc-EhRdRP2 shows nuclear staining with a faint signal in the cytoplasm, a pattern similar to that in EhAgo2-2 cells (31). We performed IP and mass spectrometry analysis for the Myc-tagged EhRdRPs (Fig. S4A) and found 48 exclusive protein hits in the EhRdRP1 sample and 41 exclusive protein hits in the EhRdRP2 sample ( Fig. S4B and C; Table S2). We noticed that the two EhRdRP data sets overlap extensively. Both EhRdRP data sets also overlap the EhAgo2-2 RISC data set, with greater than 50% protein hits (23/43), as shown in Fig. 3B. This suggests that these protein complexes could function together to carry out RNAi in E. histolytica. These shared components are listed in Fig. 3C, including RNA binding proteins, chaperones for sRNA loading, and RNAi homologs (RecQ helicase and Tudor-like protein). It is worth noting that EhRdRP1 possibly interacts with another Ago protein (EhAgo2-3), indicating a broader RNAi network among all three Ago proteins in the parasite. Of note, we did not identify EhAgo2-1 in these IP samples, but it is worth noting that this protein exhibits very low expression, which may have limited its identification by IP and mass spectrometry.
To reveal the protein-protein interactions of the Ago RISC components, we chose five protein hits found within this study for further characterization. NAP07 and NAP12 appear to be novel and unique to the parasite RISC. EBP-1 contains an RNA-binding motif, which could be important for sRNA or RISC-targeted RNAs. Guanine nucleotidebinding protein subunit beta 2-like 1 (GNB2L1), also known as receptor for activated C kinase 1 (RACK1), belongs to the tryptophan-aspartate repeat (WD-repeat) family proteins and functions as a master regulator for multiple cellular processes in other systems (48). Lastly, Tudor-like domain protein was also selected, as it is likely a homolog component of Ago RISC in model systems.
Similar to the approaches used before, the five selected genes were cloned into the same plasmid vector for Myc-tagged overexpression. After transfecting the constructs into parasites and confirming the expression of each Myc-tagged protein, we then tested if endogenous EhAgo2-2 can be pulled down in a co-IP experiment using a ready-made custom antibody against EhAgo2-2 (the antibody was generated and validated in our previous study [31]). Figure 4 and Fig. S5 show that EhAgo2-2 was only detected in the Myc-NAP12 co-IP sample (Fig. 4) and not in the co-IP samples of the other four proteins (Fig. S5). This indicated that a direct association of NAP12 to EhAgo2-2 can be verified under the co-IP experiment conditions.  We characterized five additional protein hits from our initial Ago RISC analyses. We performed anti-Myc co-IP experiments for each of the five Myc-tagged protein cell lines. EhAgo2-2 was detected in the co-IP for Myc-NAP12 using a custom-generated anti-EhAgo2-2 antibody. EhAgo2-2 signal can be detected in the anti-Myc IP sample for Myc-NAP12 but not in the IP samples for Myc-NAP07 or for the other three Myc-tagged proteins (see also Fig. S5 in the supplemental material). Due to low protein loading, a faint EhAgo2-2 signal was detected only upon overnight exposure for input lanes (data not shown). Anti-Myc Western blotting detected signals for Myc-tagged NAP07 and NAP12 proteins at their expected sizes. The membrane was also probed with an anti-histone 3 antibody, which showed signal in all input lanes but not in the IP lanes.
As mass spectrometry analysis can provide a sensitive means to broadly identify potential interacting proteins from an IP sample, we performed two independent IPs for each of the five Myc-tagged proteins to identify the interacting network among these RISC components. We set the initial selection criteria to require two of both replicates for each sample and focused on exclusive hits after comparison with the Luc control. We then compared the exclusive hits from each tagged protein data set to the EhAgo2-2 RISC hits. For example, the Myc-NAP07 mass spectrometry analysis (shown in Fig. S6A) generated 75 exclusive protein hits after removal of the Luc control. There were 19 protein hits that overlapped between Myc-NAP07 and EhAgo2-2 RISC data sets, as listed in the inset table of Fig. S6A (with the high-confidence hits as green). The analyses for NAP12, EBP-1, GNB2L1, and Tudor-like domain protein IP samples are shown in Fig. S6B to E.
To summarize the overall findings of the mass spectrometry analyses of these five Myc-tagged proteins, we listed the major protein-protein interacting relationships with EhAgo2-2 RISC in Fig. 5. (i) Each of the five selected proteins had many proteomic hits In addition to previous EhAgo2-2 and EhRdRP cell lines, mass spectrometry analyses were performed for each cell line overexpressing five putative Ago RISC components. Mass spectrometry analysis for each cell line was performed in duplicates. The detailed Venn diagram analysis is shown in Fig. S5. The major protein-protein interactions can be summarized as follows: EhAgo2-2, NAP07, and NAP12 form a closely interacting group (in brown circle) with reciprocal pulldowns among three proteins, while a second interacting group (in gray circle) consists of EBP-1, Tudor-like protein, and GNB2L1, with reciprocal pulldowns among these three proteins. The two EhRdRP complexes interact with EhAgo2-2 RISC component proteins of EBP-1, Tudor-like protein, and GNB2L1. EhRdRP1 and EhAgo2-2 both interact with RecQ helicase, a RISC homolog identified in this study. Other interesting RNA binding protein hits are EhAgo2-3 and poly(A) binding proteins in the EhRdRP1 data set and EBP-2 and NBP-2 in the EhRdRP2 data set. Single arrow indicates one-way pulldown; double arrow indicates reciprocal pulldown. Dashed circles indicate proteins are grouped as closely interacted proteins. Thick arrow lines indicate highconfidence hits, thin arrow lines indicate low-confidence hits, and * indicates a recombinant protein was generated in a bacterial system and the protein interaction was confirmed (see Fig. 6).
RISC Components in E. histolytica ® (including some found in the EhAgo2-2 RISC), indicating that each of these five proteins could have broad cellular functions in addition to their involvement in RNAi. (ii) Two NAP proteins reciprocally pulled down each other, indicating a strong binding of two NAPs. In addition, both NAPs pulled down EhAgo2-2. (iii) EhAgo2-2 and the two NAP proteins interacted with EBP-1, GNB2L1, and Tudor-like proteins. (iv) The EBP-1, GNB2L1, and Tudor-like proteins exhibited reciprocal pulldown with each other. Thus, we have identified two groups of closely interacting proteins within the Ago RISC (circled in brown and gray in Fig. 5). (v) Two EhRdRPs interacted mainly with three RISC components (EBP-1, GNB2L1, and Tudor-like proteins) (the brown circle in Fig. 5). Overall, our data define the major interactome of Ago RISC components and establish a protein-protein interaction map for these RISC components in relation to Ago.
In vitro assays confirm a direct interaction between two NAP proteins. To validate a direct physical interaction between the two NAP proteins, we expressed the two NAPs in a bacterial system, with both N-terminal glutathione-S-transferase (GST) tags and His tags. All four recombinant proteins showed prominent bands by SDS-PAGE at their expected sizes upon induction with isopropyl-b-D-thiogalactopyranoside (IPTG) (see Fig. S7A). Using the solubilization and purification conditions outlined in Materials and Methods, we were able to purify both GST-tagged and His-tagged NAP proteins (Fig. S7B). To test if there is a specific interaction between two NAP proteins, we performed a GST pulldown assay. GST-tagged NAPs were used as "bait" proteins and incubated with bacterial lysates containing "prey" His-tagged NAPs. Figure 6A shows a specific pulldown of His-NAP12 using bait GST-NAP07 and pulldown of His-NAP07 using bait GST-NAP12. The controls showed no interaction with any of these NAP proteins (additional controls are negative, as shown in Fig. S7C, including a pulldown assay using bait GST tag or pulldown for irrelevant His-tagged ERM-BP protein [49] lysate). Thus, a direct interaction between NAP07 and NAP12 can be confirmed using bacterial expression systems.
We further set out to test if GST-tagged NAPs can be used as bait proteins to pull down EhAgo2-2 from amebic cell lysates. In this experiment, GST-tagged NAP07 or NAP12 was immobilized onto glutathione-agarose beads as bait proteins. The beads were then incubated with parasite whole-cell lysate prepared from a Myc-EhAgo2-2 overexpression cell line. In addition, we included a prebound NAP bait sample (GST-NAP12 incubated with His-NAP07 lysate to form prebound NAPs of GST-NAP12 with His-NAP07). We analyzed the eluted proteins from the beads by silver staining, which showed prominent bait proteins for each sample at their expected sizes (Fig. 6B, bottom). To check if the prey Myc-EhAgo2-2 was present, we probed the eluted samples by Western blotting using an anti-Myc antibody. We observed Myc-EhAgo2-2 signal in the bait sample lanes of NAP12, NAP07, and prebound NAPs; Myc-EhAgo2-2 was not observed in the control samples. Thus, we have confirmed the interaction between EhAgo2-2 and the two NAPs using a recombinant protein approach.

DISCUSSION
In this paper, we characterized E. histolytica Ago RISC by IP mass spectrometry and identified 43 protein hits as components of EhAgo2-2 RISC. These 43 protein components have a broad range of functions based on in silico analyses and gene annotations. Analyses of two functional EhAgo2-2 mutants (with alterations in small RNA binding and nuclear localization) demonstrated that Ago RISC contents are affected under each mutant condition. Our main findings from this study can be summarized as follows: (i) EhAgo2-2 RISC lacks several known RNAi factors found in typical siRNA/ miRNA pathways but contains RecQ helicase and Tudor-like protein, which are reminiscent of Piwi RISC; (ii) EhAgo2-2 RISC contains two unique proteins with NAP domains, and their physical interaction with Ago was confirmed using an in vitro recombinant system; (iii) EhAgo2-2 RISC contains a major parasite virulence transcriptional factor; (iv) analyses of EhRdRPs, additional RNAi pathway proteins, identified significant overlap with EhAgo2-2 RISC content; and (v) characterization of five selected proteins from FIG 6 In vitro pulldown assay confirms a direct interaction between two NAP proteins and their interaction with EhAgo2-2. (A) A physical interaction exists between recombinant NAP proteins by GST pulldown. The GST pulldown assay was performed using glutathione beads to bind "bait" GST-tagged NAP proteins from the bacterial lysate. The beads were washed and then incubated with bacterial lysates containing "prey" His-tag NAP proteins. After washes, beads were then eluted by GST elution buffer, and the sample was collected. Reciprocal pulldowns of His-NAP12 (expected size, 35 kDa) using "bait" GST-NAP07 (expected size, 54 kDa) and pulldown of His-NAP07 (expected size, 32 kDa) using "bait" GST-NAP12 (expected size, 57 kDa) were observed, the pulldown of each "prey" protein is indicated by an asterisk. The beads-only and negative input controls worked as expected, as there is no pulldown of any "prey" proteins. The control GST-tag alone does not pulldown either of the His-tagged NAP proteins; using (Continued on next page) RISC Components in E. histolytica EhAgo2-2 RISC identified protein-protein interacting groups within the Ago RISC. Our work is the first to elucidate Ago RISC components in the protozoan parasite Entamoeba. We also defined the interactome of these Ago RISC components and built a map of protein-protein interactions for selected RISC components in relation to Ago.
Ago and its RISC components often display great versatility depending on the Ago type, silencing mechanism, or specific RNAi pathways (50,51). Among all Ago RISCs studied so far, T. gondii Ago is the only parasitic Ago protein that has been subjected to RISC analysis. Much similarity was drawn between TgAgo and human Ago2 RISC contents, possibly due to the presence of an active miRNA pathway in both organisms (23). Among the 43 protein hits identified as EhAgo2-2 RISC components, most do not share homology to known RNAi factors found in other/model systems. E. histolytica appears to lack a canonical Dicer protein; the only RNase III domain protein (EHI_068740) was previously shown to result in partial cleavage of dsRNA under in vitro conditions. However, it was not present in our EhAgo2-2 RISC data set. It remains to be seen if this RNase III domain protein is a necessary factor for RNAi in the parasite. Our previous work revealed all three EhAgos bind to sRNA with a 59-polyphosphate structure, indicating these sRNAs could be generated by RdRP in this parasite (28). In this study, we included IP mass spectrometry analysis for two EhRdRPs and found many shared protein hits that also overlap EhAgo2-2 RISC content. It is possible that Ago and RdRP protein complexes orchestrate a coordinated way to effectively carry out RNAi in E. histolytica.
Eukaryotic Ago proteins are grouped into three clades: Ago-like, Piwi-like, and WAGO subfamilies. The three EhAgo proteins are related to the PIWI-like clade by sequence alignment analysis (2,52). Several studies of Piwi RISC showed no interaction with Dicer or TRBP. However, Piwi RISC interacted with cofactors such as HP1 (heterochromatic protein 1a in flies) (19), rRecQ1 (a DNA nuclease in rats) (18), PRMT5 (an arginine methyltransferase in mice), and a Tudor domain protein (14). The two potential homologs, RecQ family helicase and Tudor-like protein, were identified in EhAgo2-2 RISC, suggesting a similar functional role may also occur and be required for RNAi in this parasite. Future studies geared toward understanding the exact function of these two homologs in E. histolytica could be an exciting avenue to pursue.
We have provided the first account of studying Ago RISC in Entamoeba parasites. Our data revealed 43 protein hits with functions covering transcriptional and translational regulation as well as sRNA loading and other cellular functions. This finding suggests multiple cellular processes may be regulated or involved in regulating RNAi in E. histolytica. Future studies of EhAgo RISC components and their interactomes will be crucial to understanding the regulatory mechanisms and biological implications of RNAi in this parasite. Our findings add to the understanding of this basic biological process and expand the current knowledge on RNAi RISC to a deep-branching single-celled eukaryote.

MATERIALS AND METHODS
Parasite culture, plasmids and cell lines. The E. histolytica trophozoites (HM-1:IMSS) were grown axenically under standard conditions as described previously (28,53). All plasmids used in parasites are based on the pKT3M plasmid, which overexpresses a protein of interest with an N-terminal Myc tag under the control of a cysteine synthase (CS) promoter. The plasmids for overexpressing wild-type EhAgo2-2 and its two mutants are pKT3M-EhAgo2-2, pKT3M-EhAgo2-2 DNLS-DR , and pKT3M-DHFR-EhAgo2-2 PAZ-mut , respectively. These plasmids were generated in our previous studies (28,30) and retransfected for use in this study; pKT-CS-Luc overexpresses a full-length luciferase under the same CS

FIG 6 Legend (Continued)
His-tagged ERM-BP (an irrelevant amebic protein [49]), all "bait" proteins of GST-tagged NAPs showed no reactivity. All controls are shown in Fig. S7C. (B) A physical interaction exists between EhAgo2-2 and recombinant NAP proteins by in vitro pulldown of amebic cell lysate. GST-NAP07 and GST-NAP12 were first immobilized onto glutathione-agarose beads as "bait" proteins. Beads preloaded with "bait" proteins were incubated with cell lysates prepared from parasites overexpressing Myc-EhAgo2-2. After incubation and washing, the beads were eluted and samples were analyzed. Western blotting using mouse anti-Myc and mouse anti-GST antibodies was also performed. There is a noticeable Myc-EhAgo2-2 signal in samples containing NAP "baits" (the rightmost three lanes). Two bands are usually observed due to a degraded protein form of Myc-Ago2-2. No signal corresponding to Myc-EhAgo2-2 was observed in the beads-only or GST controls. Arrows are pointing to the detected proteins under each condition.
Zhang et al.
promoter but lacks a Myc tag. In addition, we selected two EhRdRP genes (EHI_139420 and EHI_179800) and five genes of RISC components (EHI_072030, EHI_120340, EHI_121780, EHI_050550, and EHI_005150) for further characterization. We cloned each full-length gene into pKT3M, and all plasmid constructs were verified by DNA sequencing. Following transfection, stable cell lines were generated and maintained with 6 mg/ml G418. For bacterial expression of recombinant proteins, two NAP genes (EHI_072030 and EHI_120340) were amplified and cloned in-frame into pGEX-4T-1 (GST tag) using EcoRI and XhoI sites and pET-28b (His tag) using Eco53kI and XhoI sites. The resulting plasmids for N-terminal GST tagging are named pGEX-NAP07 and pGEX-NAP12; plasmids for N-terminal His tags are named pET-NAP07 and pET-NAP12. All primers for cloning used in this study can be found in Table S3 in the supplemental material.
Parasite cell lysate and IP. We used a similar protocol as that described in reference 54, with minor modifications. Two T25 flasks of confluent parasite culture were lysed in 3 ml lysis buffer ( Mass spectrometry sample preparation and data analysis. Boiled IP samples were loaded onto a 4% to 12% SDS-PAGE gel (Thermo Fisher Scientific) and separated at 110 V for 8 min to get a 1.0-cm sample size. The sample lanes were cut and fixed at room temperature for 2 h and submitted to a mass spectrometry facility (SUMS) for standard LC-MS analysis according to standard procedures. Briefly, the gel was processed into small pieces and reduced with 5 mM DTT and 50 mM ammonium bicarbonate at 55°C for 30 min. The samples were then alkylated using 10 mM acrylamide in 50 mM ammonium bicarbonate for 30 min at room temperature. Overnight trypsin/LysC digestion (Promega) was performed in the presence of 0.02% ProteaseMAX (Promega) at 37°C. The samples were dried and reconstituted in 12.5 ml reconstitution buffer (2% acetonitrile with 0.1% formic acid), and 3 ml was used for the sample injection. LC-MS was performed on an Orbitrap Q-Exactive HFX mass spectrometer (Thermo Fisher Scientific) with liquid chromatography using a NanoACQUITY ultraperformance liquid chromatography (UPLC) system (Waters Corporation). The mass spectrometer was operated in a data-dependent fashion using high-energy collisional dissociation (HCD) fragmentation for MS/MS spectrum generation.
We performed three IP replicates for each sample using LC-MS analysis. Four samples (EhAgo2-2 WT, the two EhAgo2-2 mutants, and Luc control) were used to identify the core RISC components, and a total of 12 mass spectrometry data sets were generated. For data analysis, the mass spectrometry .RAW data files were first analyzed by Byonic v 2.14.27 (Protein Metrics) to identify peptides and search protein hits against the E. histolytica genome. Proteolysis was assumed to be tryptic allowing for N-ragged cleavage with up to two missed cleavage sites. Precursor and fragment mass accuracies were held within 12 ppm. Proteins were held to a false-discovery rate of 1% using standard approaches (36).
For each individual data set, all protein hits were first separated into ribosomal and nonribosomal hits; only nonribosomal hits were analyzed further in this study. Venn diagram analysis was used for three replicates for each sample to create a two-out-of-three list for nonribosomal proteins (i.e., hits are found in two of three replicates). This analysis was applied to all samples (EhAgo2-2 WT, EhAgo2-2 PAZ-mut and EhAgo2-2 DNLS-DR mutants, and Luc control). We then compared Ago (WT or mutants) against the Luc control: protein hits not in the control were considered exclusive, and overlapping hits found in both were further analyzed for enrichment. The average spectral counts were used for enrichment analysis: protein hits with a calculated value in the Ago samples greater than the control ($3) were considered enriched. Thus, two lists of protein hits were generated for each of the three Ago samples (WT and two mutants) as exclusive and enriched lists.
The Byonic output file ranked its protein hits by the log base 10 of the P value, known as the log P value. We used the log P value to rank protein hits into high-, moderate-, and low-confidence levels. Protein hits with a log P value greater than the top decoy ($2) were considered high-confidence hits (confidence level, .99%); protein hits in the range of 0.7 to 2 were considered moderate-confidence hits (confidence levels between 80% and 99%); the rest that did not meet these two criteria were considered low-confidence hits.
For EhAgo2-2 RISC confirmation, we performed mass spectrometry analysis for two EhRdRPs and five protein hits selected from EhAgo2-2 RISC. All seven proteins were Myc tagged and overexpressed in parasites. We performed anti-Myc IP from lysates prepared from these cell lines. The mass spectrometry runs for the EhRdRP data sets were performed in triplicates, and data analysis was similar to the Ago RISC analysis. For the five selected RISC hits, mass spectrometry analysis was performed in duplicates. The criteria for selection included hits in both replicate data sets, and only exclusive hits after comparison with the Luc control were selected. The exclusive hits for each tagged protein sample were compared with the EhAgo2-2 RISC hits for the overlap.
Western blot analysis. A standard Western blotting technique was performed. Briefly, the membranes were blocked using phosphate-buffered saline (PBS) containing 1% casein (Bio-Rad) and then probed with a 1:1,000 dilution of a primary antibody overnight at 4°C. The appropriate secondary antibody was used at a 1:10,000 dilution and incubated at room temperature for 1 h. Signal was detected with ECL1 (GE Healthcare), and the film was developed by a film processor (GE Healthcare). The antibodies used for Western blotting were mouse anti-Myc (Cell Signaling), rabbit anti-actin (Cell Signaling), rabbit anti-GST (Cell Signaling) antibodies and the custom-made EhAgo2-2 polyclonal antibodies in rabbit (28).
Immunofluorescence assay. We followed the same protocol used in references 30 and 31. Briefly, parasites were fixed using methanol-acetone (1:1) and incubated with 1:250 mouse anti-Myc antibody overnight at 4°C. After washing with PBS three times, the slides were incubated with 1:1,000 Alexa 488-conjugated goat antimouse antibody for 1 h. The samples were mounted using Vectashield (Vector Laboratories, Inc.) and sealed with a cover slip. Images were collected using a Leica CTR6000 microscope using a BD CARVII confocal unit.
Expression and purification of recombinant two NAP proteins. For GST-tagged proteins, plasmids pGEX-NAP07 and pGEX-NAP12 were transformed in Escherichia coli BL21(DE3), and the two recombinant proteins (GST-NAP07 and GST-NAP12) were purified using glutathione beads (GE Healthcare). Briefly, overnight cultures were diluted 1:20 into LB and incubated at 37°C until the cell density reached an optical density at 600 nm (OD 600 ) of 0.6 to 0.7. The cultures were induced with 0.5 mM IPTG for 2 to 4 h at room temperature, and cells were pelleted and stored at 220°C. To prepare the lysates, the frozen pellets were resuspended in GST lysis buffer (10 mM Tris [pH 7.5], 150 mM NaCl, 1 mM EDTA) plus 100 mg/ ml lysozyme and incubated on ice for 30 min. Following incubation, 1.0% Triton X-100, 1 mM PMSF, and 1Â Halt protease inhibitor cocktail were added, and the cells were lysed by sonication. After centrifugation at 14,000 rpm for 10 min at 4°C, the supernatants were saved at 220°C. For purification, glutathione beads were equilibrated with GST lysis buffer, and cell lysates were added. The samples were rotated at 4°C for 1 h to allow GST-tagged proteins to bind the glutathione resin. The beads were washed three times (5 min each time) with GST-wash buffer (50 mM Tris-HCl [pH 7.5], 350 mM NaCl, 1 mM PMSF). The bound proteins were eluted with elution buffer (20 mM reduced glutathione, 50 mM Tris-HCl [pH 8.0], 10 mM MgCl 2 , 1 mM PMSF, 1Â Halt protease inhibitor cocktail). Purified protein samples were dialyzed overnight at 4°C in dialysis buffer (5 mM HEPES [pH 7.6], 1 mM DTT, 0.2 mM PMSF, 1 mM EDTA, 10% glycerol) with two changes to remove glutathione.
For His-tagged proteins, plasmids pET-NAP07 and pET-NAP12 were transformed into E. coli BL21 (DE3), and the two His-tagged proteins (His-NAP07 and His-NAP12) were purified using HisPur nickelnitrilotriacetic acid (Ni-NTA) resin (Thermo Fisher Scientific). Similar to the process described above, E. coli was cultured until an OD 600 0.6 to 0.7 and induced with 0.5 mM IPTG for 2 to 4 h at room temperature. The cells were pelleted and stored at 220°C. To prepare the lysates, the frozen pellets were resuspended in His-lysis buffer (PBS containing 10 mM imidazole) plus 100 mg/ml lysozyme and incubated on ice for 30 min. Then, 0.5% NP-40, 1 mM PMSF, and 1Â Halt protease inhibitor cocktail were added, and the cells were lysed by sonication. The lysates were centrifuged, and the supernatants were stored at 220°C. For purification, Ni-NTA resin was equilibrated with His-lysis buffer, and the lysates were added. The samples were rotated at 4°C for 1 h to allow His-tagged proteins to bind the Ni-NTA resin. The beads were washed three times (5 min each) with His-wash buffer (PBS containing 35 mM imidazole and 1 mM PMSF). The bound proteins were eluted with His-elution buffer (PBS containing 250 mM imidazole, 1 mM PMSF, and 1Â Halt protease inhibitor cocktail). Purified protein samples were dialyzed overnight at 4°C with two changes to remove imidazole.
GST pulldown assay. We used GST-tagged proteins to serve as "bait" for capturing putative His-or Myc-tagged "prey" binding partners. All bacterial lysates (GST, GST-NAP07, GST-NAP12, His-NAP07, His-NAP12, and His-ERM-BP) were made according to the detailed methods described in "Expression and purification of recombinant two NAP proteins." For each sample, 25 ml of equilibrated glutathione beads was used, and 750 ml bait lysate was added to immobilize bait proteins. The sample mixture was rotated at 4°C for 1 h, and the beads were washed three times with 700 ml GST-wash buffer (see "Expression and purification of recombinant two NAP proteins"). We added 750 ml bacterial or amebic lysates containing prey proteins to the beads. The samples were rotated at room temperature for 15 min followed by a 1-h rotation at 4°C. The bound complexes were then washed again three times with GST-wash buffer and eluted with 50 ml GST-elution buffer (see "Expression and purification of recombinant two NAP proteins"). The eluted samples were separated by SDS-PAGE and stained by Coomassie blue G-250.
Sequence alignment and phylogenetic analysis. The genome database at https://amoebadb.org/ amoeba/app was used. We searched nucleosome assembly protein domains in the genome and retrieved sequences for alignment analysis using Clustal Omega tool (www.ebi.ac.uk/Tools/msa/ clustalo/), which also shows a phylogenetic tree.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.