A Deep Exploration of the Transcriptome and “Excretory/Secretory” Proteome of Adult Fascioloides magna

Parasitic liver flukes of the family Fasciolidae are responsible for major socioeconomic losses worldwide. However, at present, knowledge of the fundamental molecular biology of these organisms is scant. Here, we characterize, for the first time, the transcriptome and secreted proteome of the adult stage of the “giant liver fluke,” Fascioloides magna, using Illumina sequencing technology and one-dimensional SDS-PAGE and OFFGEL protein electrophoresis, respectively. A total of ~54,000,000 reads were generated and assembled into ~39,000 contiguous sequences (contigs); ~20,000 peptides were predicted and classified based on homology searches, protein motifs, gene ontology, and biological pathway mapping. From the predicted proteome, 48.1% of proteins could be assigned to 384 biological pathway terms, including “spliceosome,” “RNA transport,” and “endocytosis.” Putative proteins involved in amino acid degradation were most abundant. Of the 835 secreted proteins predicted from the transcriptome of F. magna, 80 were identified in the excretory/secretory products from this parasite. Highly represented were antioxidant proteins, followed by peptidases (particularly cathepsins) and proteins involved in carbohydrate metabolism. The integration of transcriptomic and proteomic datasets generated herein sets the scene for future studies aimed at exploring the potential role(s) that molecules might play at the host-parasite interface and for establishing novel strategies for the treatment or control of parasitic fluke infections.

Parasitic liver flukes (Platyhelminthes: Trematoda) of livestock, such as Fasciola hepatica and Fasciola gigantica, are responsible for major economic losses worldwide, estimated at ~USD 3 billion, due to morbidity, mortality, and decreased productivity (1-3). The giant liver fluke, Fascioloides magna, infects a range of wild and domestic ruminants (e.g. cervids and bovids), primarily in North America and Europe (4). The life cycle of F. magna is indirect; eggs are released by mature flukes and excreted in the feces of the mammalian host. In aerated water, miracidia hatch from the eggs and penetrate the body of a susceptible, aquatic intermediate snail host (e.g. Fossaria parva or Fo. modicella in North America; Galba truncatula in Europe) within ~2 h (5). In the intermediate host, the parasite develops through the stages of sporocysts, rediae, and cercariae; the latter larval stage emerges from the snail within ~40-58 days (4). The cercariae encyst as metacercariae on submerged or emergent vegetation and are then ingested by a mammalian host. Once in the host, the metacercariae excyst and the juvenile flukes penetrate the intestinal walls and migrate to the liver, where they are encapsulated (in pairs) in pseudocysts formed by the hepatic parenchyma and then develop into mature flukes (4). The migration of the immature stages through the hepatic tissues, together with the large number of pseudocysts and the size of mature flukes (up to 8 cm in length (4)), can result in liver fibrosis. Clinical signs associated with infection by F. magna include lethargy, anorexia, depression, and weight loss, with sudden death occurring in heavily infected animals (4).
In livestock, the control of liver fluke infections has relied predominantly on treatment with anthelmintic drugs, such as closantel, oxyclozanide, and triclabendazole (6). Triclabendazole is considered the drug of choice against both juvenile and adult stages of liver flukes in the definitive (mammalian) host, whereas other compounds affect only the adult stage (7,8). Thus, triclabendazole is widely and often excessively used for the treatment of trematodiases in livestock (6,9), and this leads to a significant risk that drug resistance will develop. Indeed, there are recently published reports of triclabendazole resistance in Fa. hepatica populations in Australia (10) and in Western European countries (11-15). In addition, despite major efforts in studies aimed at developing novel intervention strategies against liver flukes (16-23), there is still a paucity of information on host-parasite interactions at the molecular level. Recent studies (3,24,25) have provided the first insights into the molecular biology of fasciolids by exploring the transcriptomes of the adult stages of both Fa. hepatica and Fa. gigantica (3,25) and the composition of the excretory/secretory (ES) products from both the juvenile and adult stages of Fa. hepatica (24). In these studies, proteolytic enzymes (e.g. cathepsins, asparaginyl endopeptidase cysteine proteases, and trypsin-like serine proteases) were identified as key molecules in both the transcriptome and ES products; these enzymes are likely to play key roles in parasite migration through tissues and in the modulation of immune responses in the mammalian host. Given the potential of proteolytic enzymes as vaccine candidates against trematodiases (18-20, 26), comparative analyses of the transcriptomes and ES products of liver flukes are of critical importance for an improved understanding of their molecular biology, as well as for developing new treatment and control strategies against them.
In the present study we explored, for the first time on a large scale, the transcriptome of adult F. magna, using Illumina-based sequencing technology and bioinformatic analyses of sequence data, and we characterized the protein components of the ES products from this developmental stage. This insight into the molecular biology of F. magna offers unprecedented opportunities for comparative investigations of various economically important liver flukes and the design of new interventions against these parasites.

EXPERIMENTAL PROCEDURES
Procurement of Parasite Material, RNA Isolation, and Illumina Sequencing-Adult F. magna were collected from naturally infected livers of red deer (Cervus elaphus) (from the Brdy mountains, Czech Republic) and washed in 0.1 M phosphate-buffered saline (PBS), pH 7.2, at 37°C. Total RNA was extracted from three individual adults of F. magna using TriPure reagent (Roche, Switzerland) and DNase I-treated (25). RNA amounts were estimated spectrophotometrically (NanoDrop Technologies, USA), and RNA integrity was verified using a 2100 BioAnalyser (Agilent, Santa Clara, CA). Polyadenylated (poly(A)+) RNA was purified from 10 µg of total RNA using Sera-mag oligo(dT) beads, fragmented to a length of 100 to 500 bp, reverse-transcribed using random hexamers, and end-repaired and adaptor-ligated according to the manufacturer's protocol (Illumina). Ligated products of ~300 bp were excised from agarose and PCR-amplified (15 cycles) (25). Products were cleaned using a MinElute column (Qiagen, Netherlands) and paired-end sequenced on a Genome Analyzer II (Illumina) (33), according to the manufacturer's protocols.
Bioinformatic Analyses of Transcriptomic Sequence Data-The 100 bp single-read sequences generated from the non-normalized cDNA library representing the adult stage of F. magna were assembled using the program Oases v0.1.22 (http://www.ebi.ac.uk/~zerbino/oases/; (34)). Adapter sequences and sequences with suboptimal read quality (i.e. PHRED score < 32.0) were eliminated. The remaining sequences (99%) were used to construct a de Bruijn graph using a k-mer value of 43 bp. Because de novo transcriptome assemblies of short-read sequence libraries can potentially lead to an underrepresentation of members of large families of molecules characterized by a high degree of sequence similarity, such as the cathepsins (25), reads with a high degree of sequence homology (98% nucleotide identity) to the conserved cathepsin propeptide inhibitor domain (I29; 177 nucleotides, nt) of a cathepsin L sequence from F. magna (GenBank protein accession number ACG50798; nucleotide EU877764.1) were used to generate contigs from the paired-end sequence datasets employing the PRICE program (http://derisilab.ucsf.edu). Eight cycles of contig extension were performed, with each new cycle using contigs (>500 nt in length) generated from the previous assembly as a template. The identity of the assembled contigs was then verified via BLASTn and BLASTx comparisons with nucleotide and amino acid sequences of trematode cathepsins, respectively, available from public databases (e.g. www.ncbi.nlm.nih.gov). The raw sequence reads from the cDNA library were then mapped to the nonredundant sequence data using the program SOAP2 (35). In brief, raw reads were aligned to the assembled, nonredundant transcriptomic data, such that each read was mapped to a unique transcript. Reads that mapped to more than one transcript (called "multi-reads") were randomly allocated to a unique transcript, such that they were recorded only once.
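The allocation of multi-reads described above can be illustrated with a minimal Python sketch. It is not the SOAP2 implementation; the function names, the input layout (a dictionary from read ID to candidate transcript IDs), and the use of a uniform random choice are assumptions made only for illustration.

```python
import random
from collections import Counter


def allocate_reads(alignments, seed=0):
    """Assign every mapped read to exactly one transcript.

    alignments: dict mapping a read ID to the list of transcript IDs it
    aligned to (hypothetical input layout). Reads with a single hit keep it;
    multi-reads are given to one candidate chosen uniformly at random
    (an assumption), so each read is recorded only once.
    """
    rng = random.Random(seed)  # fixed seed keeps the allocation reproducible
    assignment = {}
    for read_id, hits in alignments.items():
        if not hits:
            continue  # unmapped read: not counted for any transcript
        assignment[read_id] = hits[0] if len(hits) == 1 else rng.choice(hits)
    return assignment


def count_reads(assignment):
    """Number of reads recorded per transcript after allocation."""
    return Counter(assignment.values())
```

Because every read ends up with a single transcript, the per-transcript counts sum to the number of mapped reads, which is the property the mapping step relies on.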
To provide a relative assessment of transcript abundance, the numbers of raw reads that mapped to individual contigs were normalized for sequence length (i.e. reads per kilobase per million reads (36)).
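As a worked illustration of this normalization, the sketch below computes reads per kilobase per million mapped reads (RPKM) using the standard definition: read count × 10^9 / (contig length in bp × total mapped reads). The function name and the example numbers are hypothetical and are not taken from the F. magna dataset.

```python
def rpkm(read_count, contig_length_bp, total_mapped_reads):
    """Reads per kilobase of contig per million mapped reads."""
    if contig_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("contig length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (contig_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp contig, 10 million mapped reads
example = rpkm(500, 2000, 10_000_000)
```

Dividing by contig length keeps long contigs from appearing more abundant simply because they offer more positions for reads to map to.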
Isolation of ES Products-Adult F. magna (n = 15) were washed three times in large volumes of PBS and incubated for 6 h at 37°C in sterile RPMI 1640 medium (Sigma) containing 100 IU/ml penicillin G sodium, 100 µg/ml streptomycin sulfate, and 250 µg/ml amphotericin B (Sigma-Aldrich). Following incubation, the supernatant was centrifuged twice at 4500 × g for 20 min at 2°C, and the ES products were concentrated using Amicon Ultra-15 filters (10 kDa cut-off; Millipore, USA). Protein amounts were measured using the Quant-iT Protein Assay Kit (Invitrogen); 1 ml of ES product (concentration = 1 mg/ml) was lyophilized and stored at -20°C until analysis.
Electrophoresis and In-gel Digestion-Two 10 µl aliquots of F. magna ES products (9 mg/ml) were incubated for 2 h at 37°C with an equal volume of sample buffer (51). The samples were then applied to a 0.75-mm-thick, 4% stacking, and 12.5% resolving gel (prepared using the Bio-Rad PROTEAN 3) for SDS-PAGE (51). Electrophoresis was carried out using maxima of 40 mA/gel and 300 V. Proteins were stained using Coomassie Brilliant Blue and destained in 50:8.75:41.25 methanol:acetic acid:water (v:v:v). Lanes containing ES proteins were finely sliced. The gel slices were then destained twice in 200 µl of 50% acetonitrile and 25 mM NH4HCO3 for 15 min at 37°C, desiccated using a vacuum centrifuge, resuspended in 20 mM dithiothreitol (DTT), and reduced for 1 h at 65°C. The DTT was then removed, and the samples were alkylated by adding 1 M iodoacetamide (IAA) to a final concentration of 50 mM and incubated at 22°C in darkness for 40 min. Gel slices were washed three times for 45 min in 25 mM NH4HCO3 and then desiccated. Individual dried slices were then allowed to swell in 20 µl of 40 mM NH4HCO3 and 10% acetonitrile containing 20 µg/ml trypsin (Sigma) for 1 h at 22°C. An additional 50 µl of the same solution was added to the samples, and they were then incubated overnight at 37°C. The supernatant was then removed from the gel slices, and residual peptides were washed from the slices by incubating them three times in 0.1% trifluoroacetic acid for 45 min at 37°C. The original supernatant and extracts were combined and reduced to 10 µl in a vacuum centrifuge before mass spectral analysis.
OFFGEL Electrophoresis-For peptide separation, based on the isoelectric point, 240 µg of ES products were reduced by adding DTT to 20 mM and 10% SDS to 2% (w/v) and incubating at 65°C for 1 h. Alkylation was then achieved by adding IAA to 50 mM and incubating the solution for 40 min in darkness at 22°C. The sample was then co-precipitated with 1 µl of a 1 µg/µl solution of trypsin by adding nine volumes of methanol at -20°C. After being incubated overnight at -20°C, the sample was resuspended in 25 mM NH4HCO3 and incubated at 37°C for 5 h with the addition of 1 µg of trypsin after 3 h. The 3100 OFFGEL fractionator and OFFGEL kit (pH 3-10; 24-well format) (Agilent Technologies) were prepared according to the manufacturer's protocols. The ES product digests and undigested proteins were diluted in the peptide- and protein-focusing buffers, respectively, to a final volume of 3.6 ml, and 150 µl of sample was loaded into individual wells. The samples were focused in a maximum current of 50 µA until 50 kilovolt hours were reached. Peptide fractions were harvested and desiccated using a vacuum centrifuge, resuspended in 25 mM NH4HCO3, and desiccated again before mass spectral analysis.
Protein Identification Using LC-MS/MS-OFFGEL electrophoresis (OGE) fractions and tryptic fragments from in-gel digests were separated chromatographically via HPLC (Dionex Ultimate 3000) using a Zorbax 300SB-C18 column (3.5 µm, 150 mm × 75 µm; Agilent Technologies) and a linear gradient of 0% to 80% solvent B for 60 min. A preconcentration step (3 min) was performed employing a Dionex µ-Precolumn cartridge (C18, 5 µm, 300 µm × 5 mm) before commencement of the gradient. A flow rate of 300 nl/min was used for all experiments. The mobile phase consisted of solvent A (0.1% formic acid (aq)) and solvent B (80/20 acetonitrile/0.1% formic acid (aq)). Eluates from the RP-HPLC column were directly introduced into the NanoSpray II ionization source of a QSTAR Elite Hybrid MS/MS System (Applied Biosystems, USA) operated in positive ion electrospray mode. All analyses were performed using information-dependent acquisition. Analyst 2.0 (Applied Biosystems) was used for data analysis. Briefly, the acquisition protocol consisted of the use of an enhanced mass spectrum scan. The three most abundant ions detected (above background) were subjected to examination using an enhanced resolution scan to confirm the charge state of the multiply charged ions. The ions with a charge state of +2 or +3 and those with an unknown charge were then subjected to collision-induced dissociation using a rolling collision energy, dependent upon the m/z and the charge state of the ion. Enhanced product ion scans were acquired, resulting in full product ion spectra for each of the selected precursors, which were then used for subsequent database searches.
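The precursor selection logic outlined above (the three most abundant ions, then only those with a charge of +2, +3, or unknown) can be sketched in Python as follows. This is an illustration only, not the Analyst acquisition software; the tuple layout, function name, and example values are hypothetical, and the rolling collision energy is not modelled because its formula is not given in the text.

```python
def select_precursors(ions, top_n=3, allowed_charges=(2, 3)):
    """Pick precursor ions for fragmentation from one survey scan.

    ions: list of (mz, intensity, charge) tuples, with charge set to None
    when the charge state could not be determined (hypothetical layout).
    The top_n most intense ions are kept, and of those only ions with an
    allowed or unknown charge are passed on for dissociation.
    """
    ranked = sorted(ions, key=lambda ion: ion[1], reverse=True)[:top_n]
    return [ion for ion in ranked if ion[2] is None or ion[2] in allowed_charges]


# Hypothetical survey scan: (m/z, intensity, charge)
scan = [
    (400.2, 900, 2),
    (512.3, 1500, 1),   # singly charged: in the top three but filtered out
    (633.8, 1200, 3),
    (700.1, 300, None),
    (455.0, 100, 2),
]
selected = select_precursors(scan)
```

In this example the singly charged ion is among the three most abundant but is not fragmented, so two precursors remain.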
Bioinformatic Analyses of Proteomic Sequence Data-Protein sequences from mass spectrometric analyses were compared to peptide sequences predicted from transcriptomic data and to proteins available from the SwissProt database (May 2011) using Mascot v.2.3.02 (http://www.matrixscience.com), employing the following search parameters: enzyme = trypsin; precursor ion mass tolerance = ±0.1; fixed modifications = methionine oxidation; variable modifications = carbamidomethylation; number of missed cleavages allowed = 2; charge states = +2 and +3. The results from the Mascot searches were validated using the program Scaffold v.3.00.06 (Proteome Software Inc., USA) (52). Peptide and protein identifications were accepted using the Peptide Prophet algorithm (53) at probability cutoffs of 95.0% (peptides) and 99.0% (proteins); accepted proteins contained at least two identified peptides (54). Proteins containing similar peptides that could not be differentiated based on MS/MS analysis were grouped to satisfy the principles of parsimony. A false discovery rate of 0.1% was calculated using protein identifications validated using the Scaffold program. Relative protein abundance was determined by means of normalized spectral counting (52), and proteins encoded by transcripts were annotated by BLASTx comparisons (bit-score cutoff: >30) with data available in the nonredundant protein databases (www.ncbi.nlm.nih.gov). Proteins were also classified according to GO categories using the program InterProScan (41), and putative signal peptides and transmembrane domain(s) were predicted using the programs SignalP (55) and TMHMM (47). Putative mannose 6-phosphate glycosylation sites were identified using the NetNGlyc server (56).
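The acceptance criteria stated above (peptide probability of at least 95.0%, protein probability of at least 99.0%, and at least two identified peptides per protein) can be expressed as a simple filter. The sketch below is not the Scaffold implementation; the function name, argument layout, and example probabilities are hypothetical.

```python
def accept_protein(protein_probability, peptide_probabilities,
                   peptide_cutoff=0.95, protein_cutoff=0.99, min_peptides=2):
    """Return True if a protein identification passes all three thresholds.

    protein_probability: probability assigned to the protein (0 to 1).
    peptide_probabilities: probabilities of the peptides assigned to it.
    Only peptides at or above peptide_cutoff count towards min_peptides.
    """
    confident_peptides = [p for p in peptide_probabilities if p >= peptide_cutoff]
    return (protein_probability >= protein_cutoff
            and len(confident_peptides) >= min_peptides)


# Hypothetical identifications
passes = accept_protein(0.995, [0.99, 0.97, 0.60])   # two confident peptides
fails_peptides = accept_protein(0.995, [0.99, 0.60])  # only one confident peptide
fails_protein = accept_protein(0.98, [0.99, 0.97])    # protein probability too low
```

Requiring two confident peptides in addition to the protein-level cutoff guards against identifications that rest on a single spectrum.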
Proteomic Analysis of ES Proteins-F. magna ES proteins were subjected to SDS-PAGE, and the protein constituents were identified using tandem mass spectrometry. More than 37,000 spectra from in-gel digests and OGE analysis were used for Mascot searches of the transcriptome. Using Scaffold, 80 proteins were identified by at least two peptides at a 99.0% probability and with an estimated false discovery rate of 0.1% (supplemental Tables S3 and S4). Analysis of SDS-PAGE gels yielded 79 protein identifications, with OGE contributing one additional identification (supplemental Table S4). The proteins identified were assigned to functional categories on the basis of InterPro domains and/or GO categories. The largest functional group represented antioxidant proteins, followed by peptidases and proteins linked to carbohydrate metabolism; approximately one-quarter of the proteins identified could not be assigned to a major functional group, and they were thus placed in a "miscellaneous" group (Fig. 3A). In accordance with respective functional categories, GO terms highly represented among the proteins included "cysteine-type peptidase activity," "oxidoreductase activity," "serine-type endopeptidase inhibitor activity," and "carbohydrate metabolic process." Using the spectral counting method included in Scaffold, the cysteine protease inhibitor cystatin was found to be the most abundant protein in the ES products, followed by two cathepsin L1 proteases, cathepsin B, calpain, and ferritin. Cathepsin L was the most abundant proteinase identified in the ES products, with four different isoforms identified, as well as a cathepsin L-like protease. Similarly, multiple isoforms of cathepsin B2 and fatty-acid-binding protein were identified. In all cases, isoforms were reported only if at least one unique peptide (which scored significantly) was assigned to the isoform (see Fig. 4).
Based on SignalP and TMHMM analyses (47,55), ~30% of the identified proteins were inferred to contain a classical secretory signal peptide and represented mainly proteases and proteins associated with carbohydrate metabolism. Six proteins were predicted to contain more than two transmembrane helices, indicative of a membrane-bound protein. Protein localization was predicted using PSORT (62). The majority of ES products were predicted to be either extracellular (n = 27) or cytoplasmic (n = 39), with smaller numbers inferred to be localized or co-localized to the nucleus, plasma membrane, and/or mitochondria (Fig. 3B). Twelve proteins were assigned a lysosomal localization; to further validate this finding, all proteins with a predicted signal sequence were analyzed for the presence of potential mannose 6-phosphate glycosylation sites. Of the 23 proteins with predicted signal peptides, 10 proteins had a glycosylation probability of >0.75, and 6 had a probability of >0.50. Proteins inferred to be glycosylated included known lysosomal proteins, such as Pro-Xaa carboxypeptidase, alpha-mannosidase, alpha-glucosidase, and cathepsins A and B2.

Proteases represented a significant proportion (~16%) of the proteins identified in the ES products. Thirteen proteases were identified in these products: nine cysteine proteases, including cathepsin B and L, calpain, and legumain; three carboxypeptidases, including cathepsin A; and a leucine aminopeptidase (Fig. 3C). Cathepsins were the most represented proteins, with examples of cathepsins A, B, and L identified. Based on BLAST comparisons, all cathepsin L proteins identified in the ES products shared significant homology with Fa. hepatica or F. magna cathepsin L1, in accordance with the results from the analysis of the F. magna transcriptome (see Fig. 2). In contrast to the cathepsins identified by proteomic means, cathepsins D and L3 and a cathepsin B-like protease, the corresponding transcripts of which were detected in the F. magna transcriptome, were not found in the ES proteome. The cathepsins identified correlated with the most abundantly expressed transcripts, suggesting that unidentified proteases were expressed at levels below the detection limit of the mass spectrometric method used (Table II). Analysis of the peptides detected in Mascot searches showed that only one peptide representing cathepsin L (ID: 39933) (see Fig. 4) originated from the pro-region of the protease, indicating that these proteases were predominantly present in an activated form. Conversely, three pro-region peptides were identified from one of the cathepsin B2 isoforms (ID: 39926) (supplemental Fig. S1), suggesting the presence of inactive forms of this protease.
The ES products of F. magna were compared with those characterized in previous studies of Fa. hepatica (24, 63-66), C. sinensis (67), and O. viverrini (68). Approximately 65% of the proteins identified here were also detected in at least one of these flukes, but only enolase, actin, and triose-phosphate isomerase were identified in all three species. The ES proteomes of F. magna and Fa. hepatica were most similar, with 47 of 80 proteins identified here found in both species; a major difference between these proteomes was the identification of five isoforms of cathepsin L1 in F. magna and ten in Fa. hepatica. The 27 proteins unique to F. magna were less abundant and membrane-bound and/or structural, although eight of them contained a secretory signal peptide. Apart from the difference in cathepsin L1 isoform numbers, the protease profile of F. magna differed from that of Fa. hepatica only by the presence of cathepsin A. However, the protease profiles of both F. magna and Fa. hepatica (Fasciolidae) were markedly different from those of C. sinensis and O. viverrini (Opisthorchiidae) (67,68), with only cathepsin B2 and legumain detected in the ES products of both families (Figs. 3C and 3D).
DISCUSSION
The present study provides a comprehensive snapshot of the transcriptome and the ES proteome of the adult stage of F. magna and represents an invaluable resource for fundamental investigations of the molecular biology of liver flukes that are of both veterinary and public health importance. Almost two-thirds of F. magna transcripts sequenced in the present study were similar to molecules identified in the transcriptomes of Fa. hepatica and Fa. gigantica, respectively (3,25), which is indicative of the biological similarities among members of the family Fasciolidae. Most abundant in the transcriptome of F. magna were molecules containing a predicted signal peptide; this finding is in accordance with the results of a previous study of the transcriptome of Fa. gigantica (25) and likely reflects the crucial role(s) that secreted proteins play in the biology of these organisms (69). Of the 835 predicted proteins with a signal peptide in F. magna, 80 were identified in the ES products from this organism at a confidence level of 99%, in accordance with previous proteomic analyses of the ES products of other liver flukes (i.e. 20 to 90 proteins identified) (24, 66-68). This finding suggests that, in addition to peptides that were undetectable because of their low molecular weight and concentration and/or possible "endogenous" secretion, most proteins in ES products of F. magna were identified. In a previous study of the transcriptome and secreted proteome of Fa. hepatica, Robinson et al. (24) also observed a discrepancy between the number of secreted proteins predicted from transcriptomic data and that of proteins identified in the ES products from this species. In the present study, a total of 42 transcripts encoding cathepsins (e.g. cathepsins B and L) were detected in the transcriptome of F. magna, of which 7 were predicted to contain a signal peptide indicative of secretion (see supplemental Table S2), and 8 were identified in the ES products (see Fig. 3C). A possible explanation for the absence of a predicted signal peptide in transcripts encoding proteins identified in the ES products is that some members of the cathepsin family might be excreted/secreted via a "nonclassical" pathway that does not involve signal peptide cleavage (70). Conversely, the difference between the number of cathepsin-encoding transcripts predicted from transcriptomic data and the number of cathepsins identified in the ES products supports the hypothesis that the transcriptome of fasciolids encodes "endogenous" cathepsins that function in key biological pathways, such as egg production, protein turnover, and molecular remodelling (71), and "exogenous" cathepsins with roles that appear to relate to the digestion of host molecules (72,73).
FIG. 4. Cathepsin L1. Sequence alignment of four cathepsin L1 isoforms and a cathepsin L-like protease (39930) identified in the excretory/secretory (ES) products of Fascioloides magna. Peptides (assigned using Mascot) representing each isoform or multiple isoforms are highlighted in green and red, respectively. Each amino acid sequence corresponding to the mature protein is delimited by a solid black line.
In parasitic trematodes, some cathepsins (e.g. cathepsins B and L) are known to be encoded by multigene families (74), which poses a challenge for both the de novo assembly of transcripts encoding different, particularly closely related isoforms and the identification of such isoforms using mass spectroscopy. The accurate characterization of isoforms via spectrometry is highly dependent on their sequence similarity and their abundance in the matrix subjected to analysis (in this case, ES products). Thus, it is possible that the high sequence similarity of the cathepsin L proteins identified in the transcriptome impaired mass spectral identification of low-abundant cathepsins L, which, in some cases, relied on the specific detection of variation of only two or three peptides. The presence of multiple isoforms of secreted cathepsin L has been reported previously for both Fa. hepatica (24,66) and Fa. gigantica (25). In Fa. hepatica, these proteins have been shown to be crucial for parasite survival, mediating essential processes such as the digestion of host macromolecules and the suppression of the host immune response (20). In Fa. hepatica, 24 different cathepsin L isoforms have been identified and classified into five clades (designated Clades 1-5; FhCL1-5) (65); members of Clades 1, 2, and 5 were shown to be present among ES products from adult worms, whereas members of Clades 3 and 4 were detected exclusively in ES products from juvenile worms, suggesting that different isoforms play distinct roles in molecular mechanisms linked to the invasion of and survival in the vertebrate host (65). The expansion of the cathepsin family of peptidases in Fa. hepatica is the result of a series of gene duplication events (65,75). 
The sequence diversity displayed by members of the cathepsin protein family, together with their broad range of substrate specificities, has been hypothesized to play a role in the ability of parasitic trematodes to infect a wide range of mammalian hosts (20,75,76). In the present study, transcripts encoding cathepsins with significant homology to FhCL3 were identified in the transcriptome of adult F. magna; however, the corresponding proteins were not detected in the ES products of this trematode by proteomic means. This finding contrasts with a previous analysis of Fa. hepatica, in which molecules encoding FhCL3 were not detected in the transcriptome of the adult worm (65). Five distinct cathepsin L isoforms were identified in the ES proteome of F. magna, including four isoforms and a cathepsin L-like protease, all significant homologues of FhCL1. However, some S2 active site residues of the F. magna cathepsin L (see Fig. 4) peptidases were similar to those of cathepsins expressed specifically by the juvenile stages of Fa. hepatica (65). The developmental regulation of the expression of cathepsin L is likely to represent a potential adaptation of different developmental stages of the parasite to the diverse environmental conditions encountered throughout its life cycle. Based on these observations, it is tempting to speculate that if developmental regulation of cathepsin L peptidases occurs in F. magna, it will differ from that in Fa. hepatica. Given the potential of cathepsins as vaccine targets against fascioliasis (20), further studies aimed at elucidating the expression profiles of cathepsin-encoding transcripts in different stages of the F. magna life cycle, as well as the presence of members of this protein family in the ES products of the juvenile stages, will be crucial.
In addition to the cathepsin L isoforms, six other proteases were detected in the ES products from F. magna, namely, two isoforms of cathepsin B2, a cathepsin A, a legumain, two isoforms of lysosomal Pro-Xaa carboxypeptidase and a leucine aminopeptidase (Fig. 3C). Like cathepsin L, cathepsin B is thought to be expressed as an inactive zymogen and is activated following the removal of a pre-pro-region by an asparaginyl endopeptidase (legumain) (77), also present in the ES products from F. magna. In the present study, predominantly activated forms of cathepsin L were identified, as evidenced by the lack of peptides assigned to the pre-pro-region. In a single case, the amino acid sequence of one of the cathepsin B isoforms identified included three pre-pro-region peptides, typical of the inactive form of this protein; however, each of the peptides from this region was represented by only one spectrum, whereas spectral counts of peptides from the mature sequence were higher (e.g. up to 23 for one of the peptides). Therefore, it is likely that inactivated proteases constituted a small proportion of the total number of cathepsins B detected in the ES products. In addition, for the second isoform of cathepsin B2 identified herein (Table III), the Asn in the region of the amino acid sequence that precedes the first residue of the mature protein (typical of all cathepsin L peptidases from Fa. hepatica and proposed to act as a substrate for possible activation of these proteins by legumain proteases) (77) was replaced by a Glu residue. The same substitution was observed in three of the cathepsins L of F. magna (Fig. 4). Several of the proteases identified in the F. magna ES products were of lysosomal origin, including cathepsin A, the two Pro-Xaa carboxypeptidases, and, possibly, legumain (see Fig. 3C). On the basis of PSORT and the presence of mannose 6-phosphate glycosylation sites, 12 lysosomal proteins were identified in the F. magna ES products, which constituted ~15% of the total identifications; after cystatin and the cathepsin proteases, lysosomal proteins were most abundant in the ES products. Recent proteomic studies of S. mansoni (blood fluke) and Fa. hepatica vomitus also identified a number of lysosomal proteins as enzymes putatively involved in digestive processes (66,78). In S. mansoni, this observation led to the hypothesis that lysosomal proteases might be actively secreted into the gut lumen, the pH of which facilitates the activity of these proteins, in order to enable the digestion of plasma components as well as hemoglobin (78). A legumain, a pro-X carboxypeptidase, a beta-glucosidase, several isoforms of ferritin, and the Niemann-Pick C2 protein were also identified here as constituents of the ES products from F. magna. The presence of these proteins in the F. magna ES products, as well as serpin and kunitz-type protease inhibitors, is likely a consequence of the fact that, in vitro, F. magna readily regurgitates the contents of its digestive tract into the culture medium, as previously observed for Fa. hepatica (66). Indeed, the same proteins were also identified in vomitus from both the latter trematode and S. mansoni (66,78).
Table fragment (final two rows, values in extracted order): 24853, 2, 11, Unknown, Cy; 21118, 2, 32, Unknown, Cy/Ex. Table footnote: SC, number of unique peptides assigned to the protein; CO, percent cover; LO, putative subcellular localization of the protein; TM, number of transmembrane domains. The presence of a signal peptide (SP) or an N-linked glycosylation site (N) is denoted by "+"; the presence of the protein in the ES of adult Fasciola hepatica (Fh), Clonorchis sinensis (Cs), or Opisthorchis viverrini (Ov) is also denoted by "+."
The vast majority of F. magna ES products represented a complex mixture of proteins, predominantly of extracellular or cytoplasmic origin. In particular, a significant proportion of nonclassically secreted ES proteins, including a number of proteins predicted to be membrane bound, such as a tetraspanin and a glucose transporter, were identified. Unlike ES products from gastrointestinal parasitic nematodes, such as Ancylostoma caninum (31), in which classically secreted proteins are very abundant, nonclassically secreted proteins have been consistently shown to represent a significant proportion of ES products from trematode parasites, including S. japonicum (79), C. sinensis (67), O. viverrini (68), and Fa. hepatica (66). For the latter trematode, it has been proposed that the presence of these proteins in the ES products is a result of stress-induced shedding of the tegument during culture (64). However, morphological studies of this parasite have noted rapid turnover of its tegument, facilitated by subtegumental cells, leading to the hypothesis that the shedding of the tegument and associated proteins occurring in vivo might represent an immuno-defensive strategy (24,80). Two phospholipases and an ABC transporter were also identified among the ES products of F. magna. The presence of ABC transporters in the tegumental membrane of Fa. hepatica has prompted comparisons to the ER/Golgi-independent secretion of IL-1β and caspase-1 in mammalian cells (24). This process involves the formation of plasma membrane "blebs," mediated by ABC transporters, which are subsequently released as microvesicles after phospholipase-mediated fusion with the plasma membrane (81); the presence of both phospholipases and ABC transporters in the ES products of F. magna suggests that a similar process might occur in this trematode.
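The distinction drawn above between classically and nonclassically secreted proteins can be expressed as a simple decision rule on two annotations, the presence of a signal peptide and the number of predicted transmembrane domains. The Python sketch below is a minimal illustration under stated assumptions: the class and function names are hypothetical, the rule is a simplification rather than the prediction workflow used in this study, and the example annotation values are illustrative and are not taken from Table III.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    name: str
    signal_peptide: bool  # predicted N-terminal signal peptide
    tm_domains: int       # number of predicted transmembrane domains


def secretion_class(a: Annotation) -> str:
    """Assign a coarse secretion category from the two annotations."""
    if a.signal_peptide:
        return "classical"
    if a.tm_domains > 0:
        return "nonclassical, membrane-bound"
    return "nonclassical, soluble"


# Illustrative annotations only (assumed values, not the study's data).
examples = [
    Annotation("cathepsin L", signal_peptide=True, tm_domains=0),
    Annotation("tetraspanin", signal_peptide=False, tm_domains=4),
    Annotation("peroxiredoxin", signal_peptide=False, tm_domains=0),
]

tally = Counter(secretion_class(a) for a in examples)
for category, count in sorted(tally.items()):
    print(category, count)
```

Tallying the categories in this way gives the kind of proportion that is compared across species in the paragraph above; real datasets would require the outputs of dedicated prediction tools as input.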
Although some ES products identified in the present study might relate to gut content regurgitated by the parasite (66), 15 antioxidant proteins were identified, none of which possessed a signal peptide indicative of secretion. A similar profile of ES antioxidant proteins, including fatty acid-binding proteins, glutathione S-transferase, peroxiredoxin, and thioredoxin, has been described for Fa. hepatica (24). In the latter trematode, these proteins are proposed to play an important role in the evasion of host immune responses, which likely relates to the protection of the parasite from reactive oxygen species released by host immune cells (82-85), an inhibition of the proliferative potential of spleen cells (e.g. in rats) (86), and/or the recruitment and alternative activation of macrophages (86-88). All antioxidant proteins identified here, with the exception of dihydrolipoamide dehydrogenase, were also detected in the ES products from adult Fa. hepatica, suggesting that these proteins play a role in immune evasion (82-84, 86-88). In Fa. hepatica, a variable antioxidant profile was observed in ES products from different developmental stages (24), which led to the hypothesis that these nonclassically secreted antioxidant proteins were transported through an alternative, transtegumental, secretory pathway (24). However, the existence of such a pathway in fasciolid trematodes remains to be demonstrated.
The availability of entire genome sequences of related species of liver flukes, such as C. sinensis (89), and blood flukes (38-40) now provides unprecedented opportunities to (i) conduct comparative proteomic analyses; (ii) elucidate the structures and functions of key molecules (e.g. "endogenous" and "exogenous" cathepsins); (iii) explore novel biological pathways (e.g. transtegumental, secretory pathways); and (iv) establish relationships among genes, transcripts, and proteins involved specifically in the parasite's invasion of, establishment in, and interactions with the host. Advancing these areas will provide a basis for studies aimed at exploring the potential of these molecules as targets for the development of novel strategies for the control of trematodiases.
This article contains supplemental Figure S1 and Tables S1 to S4.