Circulating Small RNA Profiling of Patients with Alveolar and Cystic Echinococcosis

Simple Summary
Infectious diseases are a matter of concern worldwide, as recently evidenced by the COVID-19 pandemic. However, in many instances, pathogens develop slowly, and patients discover they are ill years after they were infected. This is the case for diseases caused by tapeworm parasites, such as alveolar (AE) and cystic (CE) echinococcosis. Both AE and CE are produced by the growth of parasite larvae in organs of the host, mainly the liver. Although the life cycles of these pathogens were elucidated over 100 years ago, current diagnostic techniques cannot determine parasite viability during infection or treatment follow-up. Recently, a novel group of diagnostic molecules, namely small RNAs (sRNAs), has emerged with promising results in several pathologies. sRNAs are short nucleic acids expressed and secreted by cells; they can be detected in fluids such as serum, and their circulating levels are altered during diverse pathological states. Here, we characterized the profile of circulating sRNAs in patients with AE and CE to identify novel biomarkers that may aid in medical decisions. As a result, a panel of 20 candidate markers related to each pathogen and/or liver lesion was identified, providing valuable knowledge to improve the diagnosis of these parasitic diseases.

Abstract
Alveolar (AE) and cystic (CE) echinococcosis are two parasitic diseases caused by the tapeworms Echinococcus multilocularis and E. granulosus sensu lato (s. l.), respectively. Currently, AE and CE are mainly diagnosed by means of imaging techniques, serology, and clinical and epidemiological data. However, no viability markers that indicate parasite state during infection are available. Extracellular small RNAs (sRNAs) are short non-coding RNAs that can be secreted by cells through association with extracellular vesicles, proteins, or lipoproteins.
Circulating sRNAs can show altered expression in pathological states; hence, they are intensively studied as biomarkers for several diseases. Here, we profiled the sRNA transcriptomes of AE and CE patients to identify novel biomarkers to aid in medical decisions when current diagnostic procedures are inconclusive. For this, endogenous and parasitic sRNAs were analyzed by sRNA sequencing in serum from disease negative, positive, and treated patients and patients harboring a non-parasitic lesion. Consequently, 20 differentially expressed sRNAs associated with AE, CE, and/or non-parasitic lesion were identified. Our results represent an in-depth characterization of the effect E. multilocularis and E. granulosus s. l. exert on the extracellular sRNA landscape in human infections and provide a set of novel candidate biomarkers for both AE and CE detection.


Introduction
Alveolar (AE) and cystic (CE) echinococcosis are two parasitic diseases caused by the tapeworms (cestodes) Echinococcus multilocularis and Echinococcus granulosus sensu lato (s. l.), respectively. The pathologies are caused by the metacestode larval stage, which develops primarily in the liver (AE and CE) and lung (CE). The general morphology of the metacestode consists of a fluid-filled bladder, bounded by an inner cellular layer (the germinal layer) and an outer acellular layer (the laminated layer). Even though these pathogens are genetically highly related [1,2], as demonstrated by genetic diversity studies performed at the whole-genome level [1], AE and CE are two different diseases which differ in parasite development and host response [3]. In AE, the parasite in the intermediate host grows in a tumor-like manner due to its capacity to bud exogenously. In addition, the germinative totipotent cells of E. multilocularis can metastasize to distant foci when released into the bloodstream of the host [4,5]. Parasite development may take up to 15 years until patients show symptoms which commonly refer to liver damage (e.g., hepatomegaly, cholestatic jaundice, and liver abscess) [3]. In CE, the metacestode grows concentrically with no exogenous budding, forming unilocular bladders called "hydatid cysts". The host response to active CE cysts involves the formation of an adventitial fibrous capsule that isolates the parasites. Parasite growth may take more than 10 years to produce symptoms [6,7] that relate mainly to the mechanical pressure caused by the parasite in specific organs. In hepatic CE, hepatomegaly, abdominal distension, and jaundice are observed, among others [8].
The life cycles of these pathogens were elucidated more than 100 years ago [9] and both parasites are classified within the priority group of parasitic diseases to be eradicated by 2030 [10], yet there are no viability markers that indicate parasite state during infection and treatment follow-up [11]. The diagnosis of abdominal CE is mainly based on ultrasound (US), which allows the classification of cysts into stages that are crucial for clinical management. In fact, a stage-specific approach to treatment has been endorsed by the WHO Informal Working Group on Echinococcosis (IWGE) [12]. Serology in the diagnosis of CE has an ancillary role, considering that currently available tests show limited sensitivity for young (CE1) and inactive (CE4 and CE5) cysts, due to the seroconversion of patients [13][14][15]. Moreover, the use of serology for the follow-up of patients has proven unreliable, both by using purified and recombinant antigens [16,17]. Furthermore, serological assays are influenced by cyst stage, dimension, and localization [18].
In AE, current radiological methods used for the diagnosis and follow-up of patients rely less on US, often requiring assessment with MRI as well as PET-CT [19][20][21]. While serological assays for AE are more sensitive and specific than those employed for CE [16], all serological assays have been shown to present cross-reactivity with other parasitic diseases and inter-operator variability [22]. Consequently, the diagnosis and follow-up of patients with both diseases is generally carried out in referral centers, with misdiagnosis and mismanagement of patients being frequent in other hospitals. Due to these reasons, the search for viability and early diagnosis markers of AE and CE is still a pending task in the field.
Small RNAs (sRNAs) are short (<200 nt) non-coding RNAs expressed intracellularly, which primarily regulate gene expression, as in the case of microRNAs (miRNAs) and sRNAs derived from tRNAs (tDRs). miRNAs are ~22-nt sRNAs expressed in eukaryotes and are mainly involved in the regulation of organismal development; hence, they are dysregulated in multiple pathological states [23]. Eukaryotic pathogens, such as helminths, express a repertoire of miRNAs among which some are not encoded in the genomes of vertebrate hosts or present high sequence divergence with respect to vertebrate orthologues. On the other hand, tDRs are 18 to 35-nt sRNAs that can regulate translation, are expressed under cellular stress conditions, and are generated by endonucleolytic cleavage of mature tRNAs [24]. In addition, sRNAs are actively secreted by cells through packaging in extracellular vesicles (EVs) or in association with proteins or lipoproteins [24]. Extracellular sRNAs can then be detected in multiple body fluids, including serum, plasma, and urine [25], and display altered circulating levels in a wide range of diseases [26][27][28], positioning them as novel biomarker candidates for pathological states. Most research in the field of extracellular RNAs has centered on the study of miRNAs carried by EVs. EVs are subcellular particles of varying biogenesis, sizes, morphology, density, and cargo that act as messenger vehicles of the components (proteins, lipids, nucleic acids, sugars) of the secreting cell [29]. EV secretion has been described in eukaryotes and prokaryotes [30], including helminth parasites [31]. In particular, the metacestode stages of Echinococcus spp. produce EVs, which can be secreted towards either the inner fluid or the extra-parasite milieu [32][33][34][35][36][37][38][39][40][41]. In E. multilocularis, it was shown that in vitro EVs are mainly secreted towards the inner metacestode fluid due to the physical restriction imposed by the laminated layer [33,34].
Furthermore, we and others have reported that E. multilocularis metacestodes secrete sRNAs in vitro [34,42]. Evidence suggests that this parasite employs both vesicular and non-vesicular pathways to export sRNAs, with a predominance of the non-vesicular route towards the external milieu, i.e., the host [34]. Regarding sRNA secretion in vivo, high-throughput profiling of circulating miRNAs was performed in experimental AE [43], experimental CE [44], and human CE [45,46]. However, parasite miRNAs were only studied in the AE experiment. Because current techniques have low sensitivity and specificity for determining early infection, infection status, and parasite viability during post-treatment follow-up, diagnostic alternatives for both epidemiological studies and individual diagnosis are urgently needed.
In this work, we aimed to characterize the circulating sRNA profiles in the context of both AE and CE to identify novel biomarkers that may aid in medical decisions when imaging and serological diagnosis are not conclusive. Endogenous as well as parasitic sRNAs were studied in comparison with AE and CE negative patients to determine if transcriptional profiles may be useful to differentiate between active versus inactive infections, treatment follow-up, and AE versus CE diagnosis.

Samples
Serum samples (500 µL) from 3 AE negative patients, 3 AE positive patients, and 3 AE positive patients treated with albendazole were obtained from the Serology Department of the Institute of Hygiene and Microbiology, University of Würzburg, Germany. AE patients were considered positive if a liver lesion was detected by ultrasonography and immunodiagnosis results were positive for E. multilocularis. In contrast, patients were diagnosed as negative when serological tests yielded negative results. Diagnosis was performed by hemagglutination test (HAT) (antigens from E. granulosus s. l. (Fumouze)) and enzyme-linked immunosorbent assay (ELISA) (EG55 antigen, i.e., recombinant Ag B, from E. granulosus s. l. [47]; EM10 [47] and total larva antigens from E. multilocularis [48]). Samples were kept at −20 °C. Serum samples (250 µL) from CE negative and CE positive patients were obtained from Policlinico San Matteo Hospital Foundation, Pavia, Italy, as part of the European Registry of Cystic Echinococcosis (ERCE) [49]. Three samples from each of the following groups were analyzed: CE negative patients with no cysts (Negative), CE negative patients with a non-parasitic lesion (NPL), CE positive with active cyst (CE1-2), CE positive with transitional cyst (CE3a and CE3b), CE positive with inactive cyst (CE4, CE5) that reached inactivation spontaneously, and CE positive with albendazole treatment (treated). NPL patients displayed one hepatic lesion and were not under therapy at the time samples were taken. Patients were considered CE positive or negative according to ultrasonography results. For CE positive patients, inclusion criteria were the presence of a single cyst, located in the liver, with a well-defined CE stage according to the WHO-IWGE classification [12]. The corresponding cyst diameter was calculated taking into consideration the longest axis.
In addition, patients were tested for routine diagnostic purposes using ELISA (RIDASCREEN Echinococcus IgG, R-Biopharm, Darmstadt, Germany), following the manufacturer's instructions. Optical density (OD) results were used to calculate and interpret a Sample Index (SI), as per the manufacturer's instructions. ELISA results were considered positive for SI ≥ 1.1, negative for SI < 0.9, and borderline for 0.9 ≤ SI < 1.1. Borderline results were considered negative. Samples were kept at −80 °C.
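The Sample Index interpretation described above can be written as a small decision rule. The following is a minimal sketch, not the manufacturer's software: the function names and the example SI values are hypothetical, and only the cut-offs (1.1 and 0.9) and the handling of borderline results come from the text.

```python
def classify_sample_index(si: float) -> str:
    """Label an ELISA Sample Index (SI) with the cut-offs quoted in the text."""
    if si >= 1.1:
        return "positive"
    if si < 0.9:
        return "negative"
    return "borderline"  # 0.9 <= SI < 1.1


def final_call(si: float) -> str:
    """Study-level call: borderline results were counted as negative."""
    label = classify_sample_index(si)
    return "negative" if label == "borderline" else label


# Hypothetical SI values, for illustration only.
for si in (0.5, 0.9, 1.0, 1.1, 2.3):
    print(si, classify_sample_index(si), final_call(si))
```

Keeping the three-way label separate from the final call makes it possible to revisit borderline sera later without recomputing the index.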

RNA Isolation, Library Construction, and Small RNA Sequencing
RNA isolation from 200 µL of serum was performed with the miRNeasy Serum/Plasma Kit. Small RNA library construction was performed with the QIAseq® miRNA Library Kit. All procedures were carried out according to the manufacturer's instructions by Qiagen Sequencing Service (Hilden, Germany). All purification steps were performed using beads (no gel size selection). Single-end 150 bp reads were sequenced with a NextSeq500/550 at Qiagen Sequencing Service (Hilden, Germany).
Human miRNAs: human mature and precursor miRNA sequences together with metazoan mature miRNAs were retrieved from the miRBase Sequence Database, Release 22.1 [53]. Only those miRNAs that fulfilled the following criteria were retained: conserved miRNA sequences, significant randfold p-value "yes", ≥10 raw read counts, and expression in at least two samples from one patient group of the set (AE or CE) under analysis. When more than one mapping was reported for the same mature sequence and the mappings displayed the same number of counts, the miRNA with the higher miRDeep score was selected. Otherwise, the miRNA with the higher number of read counts and/or reads mapping to the star sequence was selected.
Echinococcus spp. miRNAs: Echinococcus spp. mature and precursor miRNA sequences together with metazoan mature miRNAs were retrieved from the miRBase Sequence Database, Release 22.1 [53]. Retained sequences were those mapping to reference sequences with a miRDeep2 score ≥ 4, a significant randfold p-value ("yes"), and no perfect sequence identity with human miRNAs.
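The three retention criteria for parasite miRNAs can be expressed as a simple filter. This is a sketch under stated assumptions rather than the pipeline used in the study: the `Candidate` record, the function name, and the toy sequences are hypothetical, and "perfect sequence identity" is assumed to mean an exact match of the full mature sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one predicted parasite miRNA."""
    name: str
    mature_seq: str
    mirdeep2_score: float
    randfold_significant: bool  # True when randfold reports "yes"


def normalise(seq: str) -> str:
    """Upper-case and convert DNA to RNA so sequences compare consistently."""
    return seq.upper().replace("T", "U")


def retain_parasite_mirnas(candidates, human_mature_seqs, min_score=4.0):
    """Keep candidates with score >= min_score, significant randfold,
    and no exact match to any human mature miRNA sequence."""
    human = {normalise(s) for s in human_mature_seqs}
    return [
        c for c in candidates
        if c.mirdeep2_score >= min_score
        and c.randfold_significant
        and normalise(c.mature_seq) not in human
    ]


# Toy sequences, not real miRNAs.
human_toy = ["UGAGGUAGUAGG"]
candidates = [
    Candidate("toy-a", "GUGAGCAAAGUU", 5.2, True),   # passes all criteria
    Candidate("toy-b", "TGAGGTAGTAGG", 6.0, True),   # identical to a human toy sequence
    Candidate("toy-c", "ACGUACGUACGU", 3.1, True),   # score below 4
    Candidate("toy-d", "CCCGGGAAAUUU", 8.0, False),  # randfold not significant
]
kept = retain_parasite_mirnas(candidates, human_toy)
print([c.name for c in kept])
```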

Identification of Non-miRNA sRNAs
Processed reads were aligned to the corresponding genome using Bowtie version 1.2.2 (-v 0 --sam --best --time --threads 8) [54]. For sequence annotation, ad hoc non-coding RNA databases were constructed. For the human database, the following types of sequences were downloaded from the RNAcentral sequence database version 18 (https://rnacentral.org/): lncRNA, piRNA, snRNA, snoRNA, SRP RNA, rRNA, Y RNA, antisense RNA, vault RNA, RNAse MRP RNA, scRNA, telomerase RNA, RNAse P RNA, transcript, and intron (filters "Homo sapiens", "No QC warnings", "genomic mapping: available"). Human tRNA sequences were retrieved from GtRNAdb (http://gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi38/) and pre-miRNA sequences from the miRBase database. After a first annotation analysis of both AE and CE sets, piRNA sequences were excluded from the database since reads mapping to this category also mapped to other RNA biotypes, as previously described [25]. For the Echinococcus spp. database, the following types of sequences were retrieved from RNAcentral: tRNAs, snRNA, snoRNA, rRNA, RNAse P, SRP, misc RNA, and mit RNA. Pre-miRNA sequences were downloaded from the miRBase database.
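As an illustration of the alignment settings quoted above, the sketch below only assembles a Bowtie 1 command line; it does not run the aligner. It assumes the options in the text correspond to Bowtie's long-form flags `--sam`, `--best`, `--time`, and `--threads`, and that the index prefix, reads file, and output file are given positionally. All file names are placeholders, not those used in the study.

```python
import shlex


def bowtie_command(index_prefix, reads_fastq, out_sam, threads=8):
    """Build (but do not execute) a Bowtie 1 call with zero mismatches allowed."""
    return [
        "bowtie",
        "-v", "0",          # report only alignments with 0 mismatches
        "--sam",            # write SAM output
        "--best",           # report best alignments first
        "--time",           # print timing information
        "--threads", str(threads),
        index_prefix,       # placeholder index name
        reads_fastq,        # placeholder processed-reads file
        out_sam,            # placeholder output file
    ]


cmd = bowtie_command("genome_index", "processed_reads.fastq", "aligned.sam")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run` once the index and reads exist; swapping `"0"` for `"1"` or `"2"` would reproduce the mismatch comparison described in the results.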

Expression and Correlation Analyses
Differential expression of sRNAs grouped by RNA biotype was assessed with the Kruskal-Wallis test followed by Dunn's test for multiple comparisons, using the Negative group of the corresponding set of patients (AE or CE) as control.
DESeq2 was employed for Principal Component Analysis (PCA) and differential expression analysis of individual sRNAs using vst-normalized data [56]. PCA was performed with the 50 most variable genes. Fold changes for each patient group were calculated with respect to the Negative group of the corresponding set (AE or CE). DESeq2 has been shown to perform correctly with a small number of replicates [57].
Graphs were generated with GraphPad Prism 8.0.1. UpSet plots were generated with RStudio version 2022.07.2 Build 576.

Characteristics of Patients
In this work, the sRNA profiles of serum samples from AE and CE patients were analyzed in comparison to samples from negative patients. In all cases, parasites were in the liver and in the case of CE patients, only one cyst was detected by ultrasonography. Furthermore, samples from patients with a single hepatic non-parasitic lesion were included to compare the extracellular sRNA profile in the presence of lesions entering differential diagnosis with CE cysts. Tables 1 and 2 summarize data from patients.

Overall Sequencing Results
A total of 33 samples were analyzed, 9 for AE (Table 1) and 24 for CE (Table 2). Since E. multilocularis and E. granulosus s. l. samples were collected at different institutions following different protocols, the two sets were analyzed separately. Mean processed reads were ≥8.5 × 10^6 (±1.3 × 10^6) and 10.7 × 10^6 (±1.5 × 10^6) per group in AE and CE sets, respectively (Supplementary Table S1). Before performing the sRNA profiling, the maximum number of mismatches to each reference genome was determined to obtain the most specific and sensitive mapping pipeline. Due to the short length of the reads, sequences generated from highly conserved regions are likely to map to both host and parasite genomes (ambiguous sequences); however, since the Echinococcus genomes are still of lower quality than the human genome, a highly stringent mapping pipeline may exclude bona fide parasite sequences. Thus, reads from negative samples (patients with no detectable echinococcosis) were aligned to both the human and Echinococcus spp. genomes using 0, 1, and 2 mismatches. Best results were obtained in both control groups (Negative and Non-parasitic lesion) using 0 mismatches, whereas 1 and 2 mismatches yielded a high percentage of unspecific mapping (Supplementary Figure S1). Furthermore, no differences between mapping with the E. multilocularis or E. granulosus s. s. genomes were observed.
Reads mapping to the human genome showed mean values of 38.3% (±13.1) and 56.7% (±9.0) for the AE and CE sets, respectively. With respect to Echinococcus spp. mapping percentages, mean values were 1.8% (±1.0) and 1.0% (±0.3) for the AE and CE samples, respectively (Figure 1, Supplementary Table S1). Pairwise correlation coefficients were calculated within each patient group, and two samples from the AE set and four from the CE set were excluded from further analyses due to low correlation (Supplementary Figure S2, Supplementary Table S1).
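The within-group correlation check can be sketched as follows. The text does not state which coefficient, data transformation, or cut-off was used, so everything here is an assumption made for illustration: Pearson correlation, a cut-off of 0.8, the rule of flagging a sample whose best correlation with any other group member falls below the cut-off, and the toy expression values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_partner_correlation(group):
    """For each sample, the highest correlation with any other sample in the group."""
    best = {name: -1.0 for name in group}
    for a, b in combinations(group, 2):
        r = pearson(group[a], group[b])
        best[a] = max(best[a], r)
        best[b] = max(best[b], r)
    return best


def flag_low_correlation(group, cutoff=0.8):
    """Samples that do not correlate well with any other group member (assumed rule)."""
    return sorted(n for n, r in best_partner_correlation(group).items() if r < cutoff)


# Toy expression values for three samples of one hypothetical patient group.
group = {
    "s1": [10, 20, 30, 40, 50],
    "s2": [12, 19, 33, 41, 48],
    "s3": [50, 10, 40, 20, 30],
}
print(flag_low_correlation(group))
```

With only three samples per group, a rule based on the mean pairwise correlation would penalize every sample when one is an outlier, which is why this sketch uses the best partner instead.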

Circulating Endogenous sRNA Profile in AE Patients
To obtain an overview of the similarity of the transcriptional profiles within each group, a PCA was performed with the top 50 circulating sRNAs. The three negative patients displayed similar sRNA patterns and clustered together, apart from the rest of the samples except for one AE positive patient (Figure 2). The most abundant sRNA biotypes detected were tDRs (52.3%, 47.0-71.1) and miRNAs (28.8%, 17.3-33.9) (Figure 3). In addition, reads mapping to ribosomal RNA (rRNA), Y-RNA, and small nuclear RNA (snRNA), among others, were also identified. In tDRs, Y-RNAs, and snRNAs, the most abundantly detected sRNAs were generated from specific loci. In the case of tDRs, the 5p-halves of tRNA-Glu, tRNA-Gly, and tRNA-Val accounted for 95% of total tRNA-mapping reads (Figure 4, Supplementary Table S2). With respect to Y-RNAs, ≥60% of the reads in each patient were derived from the 3′-end (position 70-93 nt, 5′-CCCACUGCUAAAUUUGACUGGCUU-3′) of Y4 RNA (URS0000188F7D_9606), here called hY4-sRNA-3p (Supplementary Table S3). Finally, ≥75% of the snRNA-mapping reads in each sample corresponded to miR-1246 (position 94-116 nt, 5′-AAAUGGAUUUUUGGAGCAGGG-3′) (Supplementary Table S3). This sRNA is a non-canonical miRNA that is generated from the U2 snRNA (URS0000A90D33_9606 snRNA_AC024051.12). Interestingly, snRNAs were significantly upregulated in positive patients (p = 0.025).
The differential expression in AE positive and AE treated patients with respect to the AE negative group was assessed for the 202 human miRNAs detected, tRNA-Glu-5p, tRNA-Gly-5p, tRNA-Val-5p, hY4-sRNA-3p, and miR-1246/U2. Raw and normalized read counts are reported in Supplementary Tables S4 and S5. sRNAs were considered to be significantly altered if they (i) displayed ≥100 raw counts in ≥2 samples from at least one group and (ii) showed a significant ≥1.5-fold change (log2 fold change ≤ −0.6 or ≥ 0.6). As a result, four and nine sRNAs showed significantly altered levels in AE positive and AE treated patients, respectively (Figure 5A,B). In both groups, miR-122-5p showed a 4-fold upregulation while miR-144-5p showed no detectable expression. In agreement with the significant upregulation observed for snRNAs in AE positive patients, miR-1246/U2 showed a 4.6-fold upregulation compared to the negative group. With respect to tDRs, tRNA-Val-5p showed a 3.5-fold upregulation in AE positive patients while tRNA-Gly-5p displayed a 7.1-fold downregulation in AE treated patients. miR-150-5p and miR-483-5p were upregulated (2.5× and 2.6×, respectively) in AE treated patients, while miR-324-5p, miR-485-3p, and miR-374a-5p showed no detectable expression in this group. Finally, hY4-sRNA-3p showed a 2.5-fold downregulation in AE treated patients.
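The two selection criteria for the AE set can be checked with a few lines of code. This is an illustrative sketch, not the DESeq2 workflow itself: the function names and the count values are hypothetical, and statistical significance is passed in as a ready-made boolean because the text does not state the p-value cut-off.

```python
from math import log2


def passes_count_filter(counts_by_group, min_count=100, min_samples=2):
    """True if at least one group has >= min_samples samples with >= min_count raw counts."""
    return any(
        sum(c >= min_count for c in counts) >= min_samples
        for counts in counts_by_group.values()
    )


def passes_fold_change(fold_change, significant, log2_cutoff=0.6):
    """True for a significant change with |log2(fold change)| >= log2_cutoff."""
    return significant and abs(log2(fold_change)) >= log2_cutoff


# Hypothetical raw counts for one sRNA in two groups of three samples.
counts = {"negative": [5, 20, 30], "positive": [150, 90, 300]}
print(passes_count_filter(counts))

# Fold changes quoted in the text: 4-fold up and 7.1-fold down.
print(round(log2(4.0), 2), passes_fold_change(4.0, True))
print(round(log2(1 / 7.1), 2), passes_fold_change(1 / 7.1, True))
```

Note that log2(1.5) is about 0.585, so a cut-off of 0.6 on the log2 scale corresponds to a fold change of roughly 1.52 rather than exactly 1.5.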

Circulating Endogenous sRNA Profile in CE Patients
Overall analysis of the extracellular transcriptomes of the CE set did not show any characteristic clustering of the analyzed patient groups (Supplementary Figure S3). As observed in the AE set, the most abundant sRNA biotypes corresponded to tDRs and miRNAs, and reads mapping to rRNAs, Y-RNAs, and snRNAs, among others, were also detected (Figure 6). In addition, the 5p-halves of tRNA-Glu, tRNA-Gly, and tRNA-Val were the most abundantly detected among tDRs, as were hY4-sRNA-3p and miR-1246/U2 among Y-RNAs and snRNAs, respectively (Supplementary Tables S2 and S3). The differential expression of the 367 human miRNAs detected, together with tRNA-Glu-5p, tRNA-Gly-5p, tRNA-Val-5p, hY4-sRNA-3p, and miR-1246/U2, was assessed for each group of CE patients with respect to the CE negative group. Moreover, patients with a single non-parasitic lesion were analyzed. Raw and normalized read counts are reported in Supplementary Tables S6 and S7. Two alternative approaches were employed to identify altered expression levels. In the first, patients were grouped as active CE if the cysts they harbored were classified as CE1+2, CE3a, or CE3b, and as inactive CE if the cysts were CE4 or CE5. The second approach considered each parasite stage separately. Only those sRNAs that fulfilled the following criteria were considered: (i) detection by both strategies in all or all-except-one patient groups, e.g., all CE groups except CE1.2 (Supplementary Figure S4); (ii) ≥100 raw counts in ≥50% of samples (Strategy 1) or in ≥2 samples in one group (Strategy 2); and (iii) a significant ≥1.5-fold change (log2 fold change ≤ −0.6 or ≥ 0.6).
As a result, nine differentially expressed sRNAs were detected (Figure 7): miR-1246/U2, miR-671-5p, and miR-423-5p were upregulated in all the groups; miR-125b-5p was downregulated in all the groups except inactive patients; miR-192-5p was downregulated in active, inactive, and treated patients; miR-1-3p was altered in patients with inactive cysts; miR-125a-5p and miR-590-3p were downregulated only in treated patients; and miR-431-5p was downregulated in NPL patients. Interestingly, miR-671-5p and miR-431-5p displayed no expression in CE negative and NPL patients, respectively.
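The first grouping strategy and the two count criteria used for the CE set can be sketched as below. This is an illustration under assumptions, not the authors' code: "CE1+2" is read as stages CE1 and CE2, the ≥50% criterion is read as a fraction of the samples in a group, and all names and counts are hypothetical.

```python
# Strategy 1 grouping of WHO-IWGE cyst stages, as described in the text.
STRATEGY1_GROUPS = {
    "CE1": "active", "CE2": "active", "CE3a": "active", "CE3b": "active",
    "CE4": "inactive", "CE5": "inactive",
}


def strategy1_group(stage: str) -> str:
    """Map a cyst stage to the active/inactive grouping of Strategy 1."""
    return STRATEGY1_GROUPS[stage]


def enough_counts_strategy1(raw_counts, min_count=100, min_fraction=0.5):
    """>= min_count raw counts in at least min_fraction of the group's samples."""
    hits = sum(c >= min_count for c in raw_counts)
    return hits / len(raw_counts) >= min_fraction


def enough_counts_strategy2(raw_counts, min_count=100, min_samples=2):
    """>= min_count raw counts in at least min_samples samples of the group."""
    return sum(c >= min_count for c in raw_counts) >= min_samples


# Hypothetical raw counts for one sRNA.
active_counts = [120, 80, 300, 10, 150, 90]   # pooled active group (Strategy 1)
ce4_5_counts = [40, 250, 60]                  # single stage group (Strategy 2)

print(strategy1_group("CE3b"), enough_counts_strategy1(active_counts))
print(enough_counts_strategy2(ce4_5_counts))
```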

Parasite sRNAs
Due to the low proportion of parasite-derived sequences detected (Figure 1), parasite sRNAs were searched for in all patient samples. The most abundant Echinococcus reads mapped to rRNAs and tRNAs (Tables 3 and 4). Further manual inspection of non-miRNA sequences showed that they present low complexity and/or high identity with other organisms, indicating that they cannot be regarded as Echinococcus-specific and most probably correspond to the host. These types of sequences are considered of low specificity for determining their origin in small RNA sequencing experiments [58] and were consequently excluded from further analyses. With respect to miRNAs, in the AE set, only emu-miR-87-3p was detected, in one AE positive and two AE treated patients (Table 3). Surprisingly, in the CE set, egr-miR-87-3p was also detected, but in one CE negative and two non-parasitic lesion patients. No parasite miRNAs were detected in CE positive or CE treated samples. Of note, miR-87-3p is not expressed in vertebrates [59] and presents high sequence conservation among helminths (Supplementary Figure S5).

Diagnostic Potential of Endogenous Circulating sRNAs
To explore the diagnostic potential of the differentially expressed sRNAs in the context of AE and CE, correlation analyses between the level of the circulating sRNAs and serology results were performed (Supplementary Tables S8 and S9). In AE, significant correlations were observed only for miR-122-5p and miR-1246/U2, with strong positive associations in both cases ( Figure 8A). In CE, no significant correlations were detected.

In CE patients, the correlation with cyst size was also analyzed to explore whether any of the sRNAs varied in expression with this parameter. Here, only miR-1246/U2 and miR-423-5p displayed significant positive correlations (Figure 8B, Supplementary Table S10).
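As a rough illustration of the correlation analyses described above, the following Python sketch computes a rank correlation coefficient between a circulating sRNA level and a clinical variable such as a serology value or cyst size. The use of Spearman's coefficient is an assumption made for illustration, since the statistic employed is not specified in this section. The function names and the example values are hypothetical and are not study data, and significance testing is omitted.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for illustration only (not data from this study):
# normalized counts of one sRNA and a serology readout for six patients.
srna_level = [120.0, 340.0, 95.0, 410.0, 220.0, 300.0]
serology = [0.8, 2.1, 0.5, 2.9, 1.1, 1.9]
rho = spearman_rho(srna_level, serology)
```

A rank-based coefficient is a common choice for small cohorts because it does not assume a linear relationship or normally distributed values; with very few patients per group, any coefficient should be read with caution.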

Discussion
In this work, an in-depth profiling of the circulating sRNA transcriptome in the context of AE and CE was performed. The different biotypes of extracellular sRNAs were identified, including both endogenous (human) and parasitic sRNAs. With respect to endogenous sRNAs, general profiles in both AE and CE demonstrated a highly heterogeneous response that precluded a clear clustering of each group of patients, except for the AE negative group. This could in part be related to the limited cohort sizes of the analyzed groups and to the fact that only female samples were analyzed for AE and mixed genders for CE; however, specific differentially expressed sRNAs for each disease were identified. Many of these sRNAs have shown altered expression in other liver pathologies. For instance, circulating miR-1246/U2 is upregulated in hepatocellular carcinoma [60] and constitutes a non-canonical miRNA that is generated from the U2 snRNA in a DROSHA- and DICER-independent manner [61]. As has been reported elsewhere [62], reads associated with this sRNA do not map to the precursor miRNA sequence deposited in miRBase and hence the miRDeep pipeline cannot detect it. miR-1246/U2 is mainly considered an oncomiR because it promotes the regulation of cellular processes that lead to multiple types of cancer [26,61]. miR-122-5p and miR-192-5p show liver-enriched expression [63,64], are involved in pathways related to hepatic metabolism and development [28,63], and display altered circulating levels in multiple liver pathologies [27,28,65]. Circulating miR-423-5p and miR-144-5p showed altered levels in patients with chronic hepatitis B virus infection [66]. Finally, miR-150-5p and miR-125a-5p, two miRNAs related to the regulation of the immune response, were differentially expressed in AE and CE treated patients, respectively. miR-150-5p is enriched in lymph nodes and spleen [64] and regulates B cell differentiation [67], while miR-125a-5p is involved in the regulation of inflammatory processes [68] and presents decreased circulating levels in the serum of patients with chronic inflammation [69]. Further studies will be required to determine whether these miRNAs could be sensitive markers of treatment efficacy that reflect the inflammatory response triggered by the spillage of parasitic antigens.
Interestingly, miR-1246/U2 and miR-122-5p were found in the metacestode inner fluid of E. multilocularis grown in vitro in co-culture with rat hepatoma cells [34]. Since the laminated layer of this parasite prevents an efficient release of EVs to the extra-parasite milieu, we hypothesize that a similar situation occurs with host EVs in the opposite direction. This would imply that these miRNAs are secreted in small EVs or associated with non-vesicular carriers, i.e., proteins or lipoproteins. In line with this, evidence shows that miR-1246/U2 is mainly secreted in small non-vesicular nanoparticles (~30 nm diameter) [62], while miR-122-5p is detected in EV and non-EV carriers in serum and plasma [70,71]. With respect to the remaining sRNAs, association with non-vesicular carriers was proposed for miR-671-5p, miR-423-5p, miR-144, and miR-1-3p [62,71,72], while miR-192 and miR-150-5p were related to both vesicular and non-vesicular carriers [70]. Thus, since extracellular miRNAs are secreted through alternative pathways, a priori assumptions that lead to EV isolation for miRNA enrichment can hinder sRNA detection.
sRNAs other than miRNAs, such as those derived from tRNAs and Y-RNAs, were also differentially expressed during AE. The identified sRNAs derived from the 5' half of tRNA-Gly and tRNA-Val, and from the 3' end of hY4-RNA. In accordance with previous reports, these types of sRNAs are highly abundant in the bloodstream [25,73] and an immunomodulatory role has been proposed for both [73,74]. Furthermore, tRNA-Val-derived sRNAs are enriched in liver tissue subjected to stress conditions [75] and, together with 5'-tRNA-Gly-derived sRNAs, were shown to be upregulated in the livers of patients with chronic viral hepatitis compared to uninfected controls [76], suggesting a common role in infectious liver diseases.
In both AE and CE treated groups, patients did not restore overall normal levels of endogenous circulating sRNAs, which may be related to the fact that samples were taken during albendazole treatment. However, miR-1246/U2 and tRNA-Val-5p, which were upregulated in AE positive patients, did not present significant differences in AE treated patients with respect to the negative group, positioning them as candidate treatment follow-up biomarkers. Nevertheless, further analyses are required to determine whether the sRNAs detected here present a more sensitive and specific response than serology tests for this purpose.
The candidate markers identified in this work were not detected in previous sRNA high-throughput studies of human CE [45,46], since different sample types (e.g., blood), technologies, and/or analysis criteria were used. In the mentioned articles, PCR arrays were employed; thus, results depended on the composition of each selected plate, where a limited number of miRNAs were amplified. This reinforces the need for a population study to determine whether the reported differences relate to methodological issues or to the characteristics of the analyzed patients (sample type, age, gender, parasite isolate, co-morbidity), since limited cohort sizes were used in this work and in the other mentioned studies.
With respect to pathogen sRNAs, a single miRNA (miR-87-3p) was detected in 50% of AE patients. This miRNA presents a high sequence conservation with orthologues from other worm species; it is secreted by both round and flatworms in vitro [31] and was detected in the sera of patients infected with Onchocerca volvulus [77]. Thus, the sole detection of circulating miR-87-3p may not be pathognomonic of E. multilocularis but of worm infection. The fact that only one miRNA was detected suggests that, if E. multilocularis secretes or releases other extracellular RNAs in vivo, they are not abundant enough to be detected or they are carried in a vehicle that cannot efficiently reach the bloodstream. Previously, we described sRNA secretion of active (viable) and transitional (senescent) in vitro cultures of E. multilocularis metacestodes and observed that miR-87-3p is found in the non-vesicular fraction of the culture medium, i.e., this miRNA is most likely secreted in soluble ribonucleoprotein complexes or associated with lipoproteins [34]. In accordance with the detection of this miRNA in 2 out of 3 of the treated patients, all the miRNAs secreted in vitro displayed higher secretion levels in the transitional cultures corresponding to parasites with compromised tegument integrity [34]. In contrast, no parasite miRNAs were detected in CE patients. Previously, high-throughput characterizations of the profiles of circulating sRNAs were performed in experimental AE [43], experimental CE [44], and human CE [45,46]. However, parasite miRNAs were only studied in the AE experiment, where the authors inoculated mice with protoscoleces and identified 7 circulating miRNAs, which did not include emu-miR-87-3p.
Overall, here we propose a panel composed of 20 sRNAs (Figure 9), which includes markers for general liver lesion, AE, CE, inactive CE, helminth presence, and differential diagnosis with other non-parasitic hepatic lesions. This work constitutes an exploratory assay, and the sRNA panel described should be further tested in a larger number of samples to assess its diagnostic performance in a population study. Nonetheless, the proposed set of sRNAs represents novel candidate biomarkers that could provide useful information for medical decisions in cases when current diagnostic procedures are inconclusive.
Figure 9. Endogenous and parasitic sRNAs differentially expressed in each patient group with respect to negative patients. * Parasitic microRNA. For endogenous sRNAs, only those displaying ≥100 raw counts, ≥1.5 fold change, and p adjusted values ≤ 0.05 were considered. NPL: Non-parasitic lesion. Green sRNA name: upregulated expression. Red sRNA name: downregulated expression. Black and gray dots: indicate presence or absence, respectively, in the group described in the corresponding row; connecting line: indicates intersection between groups with black dots and the number on a bar indicates the number of intersecting sRNAs.
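The inclusion thresholds stated in the Figure 9 caption (≥100 raw counts, ≥1.5 fold change, adjusted p ≤ 0.05) can be written as a simple filter. The Python sketch below is illustrative only: the function names are hypothetical, the fold change is assumed to be a linear ratio of a patient group over the negative group, and the 1.5-fold threshold is assumed to apply symmetrically to upregulated and downregulated sRNAs, which the caption does not state explicitly.

```python
def passes_thresholds(raw_counts, fold_change, p_adjusted,
                      min_counts=100, min_fold_change=1.5,
                      max_p_adjusted=0.05):
    """Return True if an sRNA meets all three inclusion thresholds.

    fold_change is a linear ratio (group / negative group) and must be > 0.
    Assumption: a ratio of 1/1.5 or lower counts as a 1.5-fold decrease.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    # Express the change as a magnitude >= 1 regardless of direction.
    magnitude = max(fold_change, 1.0 / fold_change)
    return (raw_counts >= min_counts
            and magnitude >= min_fold_change
            and p_adjusted <= max_p_adjusted)


def direction(fold_change):
    """Label the direction of change relative to the negative group."""
    return "up" if fold_change > 1 else "down"


def select_candidates(records):
    """Keep records that pass the thresholds and annotate their direction.

    Each record is a dict with hypothetical keys:
    'name', 'raw_counts', 'fold_change', 'p_adjusted'.
    """
    selected = []
    for rec in records:
        if passes_thresholds(rec["raw_counts"], rec["fold_change"],
                             rec["p_adjusted"]):
            selected.append((rec["name"], direction(rec["fold_change"])))
    return selected


# Placeholder records for illustration only (not results from this study).
example_records = [
    {"name": "sRNA_A", "raw_counts": 250, "fold_change": 2.0, "p_adjusted": 0.01},
    {"name": "sRNA_B", "raw_counts": 50, "fold_change": 3.0, "p_adjusted": 0.001},
    {"name": "sRNA_C", "raw_counts": 500, "fold_change": 0.4, "p_adjusted": 0.02},
    {"name": "sRNA_D", "raw_counts": 500, "fold_change": 1.2, "p_adjusted": 0.001},
]
candidates = select_candidates(example_records)
```

Requiring a minimum raw count alongside the fold-change and adjusted p-value cut-offs is a conservative design choice: it discards low-abundance sRNAs whose apparent changes would be hard to validate by targeted assays.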

Conclusions
Here, novel circulating candidate markers for AE and CE have been identified, including sRNAs related to liver physiology and immunomodulatory processes. The RNA biotypes detected are not restricted to miRNAs, but also comprise sRNAs derived from tRNAs and a Y-RNA. The proposed biomarkers can be classified into different categories as indicative of (i) non-parasitic, AE and/or CE liver lesions; (ii) AE; (iii) CE; (iv) inactive CE; and (v) non-parasitic liver lesion. Furthermore, a circulating parasitic miRNA was detected in AE patients. Overall, our results provide an in-depth characterization of the effect that E. multilocularis and E. granulosus s. l. exert on the extracellular sRNA landscape in human infections.